[JavaScript] Detecting Hyphens and Dashes with Regular Expressions Using unicode-regex + Dash_Punctuation
This post shows how to detect hyphens, dashes, and similar delimiter characters with regular expressions in JavaScript and Node.js, using unicode-regex with the Dash_Punctuation category.
There are many characters that look like “―”. To detect all of them in user input with a single regular expression, I used the npm package unicode-regex.
Examples:
Even the first page of conversion candidates in Google Japanese Input shows this many variants.
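const unicodeRegex = require('unicode-regex');
// Build a character set covering the Dash_Punctuation (Pd) general category.
const dashRegex = unicodeRegex({
  General_Category: ['Dash_Punctuation']
});
// Print the generated character class.
console.log(dashRegex.toString());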
// [\\u002d\\u058a\\u05be\\u1400\\u1806\\u2010-\\u2015\\u2e17\\u2e1a\\u2e3a-\\u2e3b\\u2e40\\u301c\\u3030\\u30a0\\ufe31-\\ufe32\\ufe58\\ufe63\\uff0d]
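As a rough sketch of how the generated character class might be used for detection, the example below copies the class from the output above into a plain RegExp, so it runs without the unicode-regex package. The helper names containsDash and normalizeDashes are hypothetical, introduced here only for illustration, and the class covers only the code points listed in that output.

```javascript
// Character class copied from the toString() output above.
// In a JS string literal, '\\u002d' becomes the regex escape \u002d.
const DASH_CLASS =
  '[\\u002d\\u058a\\u05be\\u1400\\u1806\\u2010-\\u2015\\u2e17\\u2e1a' +
  '\\u2e3a-\\u2e3b\\u2e40\\u301c\\u3030\\u30a0\\ufe31-\\ufe32\\ufe58' +
  '\\ufe63\\uff0d]';

// Hypothetical helper: true when the input contains any dash-like character.
// A fresh non-global RegExp avoids the lastIndex state of /g with test().
function containsDash(input) {
  return new RegExp(DASH_CLASS).test(input);
}

// Hypothetical helper: replace every dash-like character with ASCII "-".
function normalizeDashes(input) {
  return input.replace(new RegExp(DASH_CLASS, 'g'), '-');
}

console.log(containsDash('03\u20101234\u20105678'));    // true (U+2010 HYPHEN)
console.log(containsDash('0312345678'));                // false
console.log(normalizeDashes('03\uff0d1234\u20145678')); // 03-1234-5678
```

Note that some look-alike characters are not in the Dash_Punctuation category, for example the katakana prolonged sound mark (U+30FC) and the minus sign (U+2212), so this class does not match them.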
Reference: List of Unicode Characters of Category “Dash Punctuation”
That’s all from the Gemba.