sql - Regex / Parse PostgreSQL names -


going through postgersql names , identifiers, i'm trying figure out following...

is possible implement regular expression (or parse otherwise) check whether postgresql name / identifier needs wrapped in double-quotes or not? , if possible, check if unicode-style name or not.

i need inject identifier name sql query, , i'd wrap double quotes if name needs it, because read adding quotes not needed in fact changes name format (see updates below).

update-1:

ultimately, need implement following function:

function preparesqlname(name) {      if (/* regular name*/) {         return name;     }      if (/* unicode name */) {         return 'u&"' + name + '"';     } else {         /* needs double quotes */         return '"' + name + '"';     } } 

one of major points of concern - processing non-english names. example, following valid postgresql (in russian) doesn't require wrapping double quotes:

create table жирафы(вес int, рост int, возраст int, имя text); select имя, вес жирафы order рост; 

update-2:

examples of why such big deal...

create table test1(name text);  -- can queried as: select * test1 select * test1 select * "test1"  -- not as: select * "test1" 

another one:

create table "test2"(name text);  -- can queried as: select * "test2" 

and gets trickier unicode-formatted names, ones require u& in front.

var xregexp = require('xregexp');  var isunquotedidentifier = new xregexp('^[\pl_][\pl\pm_0-9$]*$'); var reservedwords = new set(['a', 'abort', 'abs', 'absent', 'absolute', 'access', 'according', 'action', 'ada', 'add', 'admin', 'after', 'aggregate', 'all', 'allocate', 'also', 'alter', 'always', 'analyse', 'analyze', 'and', 'any', 'are', 'array', 'array_agg', 'array_max_cardinality', 'as', 'asc', 'asensitive', 'assertion', 'assignment', 'asymmetric', 'at', 'atomic', 'attribute', 'attributes', 'authorization', 'avg', 'backward', 'base64', 'before', 'begin', 'begin_frame', 'begin_partition', 'bernoulli', 'between', 'bigint', 'binary', 'bit', 'bit_length', 'blob', 'blocked', 'bom', 'boolean', 'both', 'breadth', 'by', 'c', 'cache', 'call', 'called', 'cardinality', 'cascade', 'cascaded', 'case', 'cast', 'catalog', 'catalog_name', 'ceil', 'ceiling', 'chain', 'char', 'character', 'characteristics', 'characters', 'character_length', 'character_set_catalog', 'character_set_name', 'character_set_schema', 'char_length', 'check', 'checkpoint', 'class', 'class_origin', 'clob', 'close', 'cluster', 'coalesce', 'cobol', 'collate', 'collation', 'collation_catalog', 'collation_name', 'collation_schema', 'collect', 'column', 'columns', 'column_name', 'command_function', 'command_function_code', 'comment', 'comments', 'commit', 'committed', 'concurrently', 'condition', 'condition_number', 'configuration', 'conflict', 'connect', 'connection', 'connection_name', 'constraint', 'constraints', 'constraint_catalog', 'constraint_name', 'constraint_schema', 'constructor', 'contains', 'content', 'continue', 'control', 'conversion', 'convert', 'copy', 'corr', 'corresponding', 'cost', 'count', 'covar_pop', 'covar_samp', 'create', 'cross', 'csv', 'cube', 'cume_dist', 'current', 'current_catalog', 'current_date', 'current_default_transform_group', 'current_path', 'current_role', 'current_row', 'current_schema', 'current_time', 'current_timestamp', 'current_transform_group_for_type', 'current_user', 'cursor', 'cursor_name', 'cycle', 'data', 'database', 'datalink', 'date', 'datetime_interval_code', 'datetime_interval_precision', 'day', 'db', 'deallocate', 'dec', 'decimal', 'declare', 'default', 'defaults', 'deferrable', 'deferred', 'defined', 'definer', 'degree', 'delete', 'delimiter', 'delimiters', 'dense_rank', 'depth', 'deref', 'derived', 'desc', 'describe', 'descriptor', 'deterministic', 'diagnostics', 'dictionary', 'disable', 'discard', 'disconnect', 'dispatch', 'distinct', 'dlnewcopy', 'dlpreviouscopy', 'dlurlcomplete', 'dlurlcompleteonly', 'dlurlcompletewrite', 'dlurlpath', 'dlurlpathonly', 'dlurlpathwrite', 'dlurlscheme', 'dlurlserver', 'dlvalue', 'do', 'document', 'domain', 'double', 'drop', 'dynamic', 'dynamic_function', 'dynamic_function_code', 'each', 'element', 'else', 'empty', 'enable', 'encoding', 'encrypted', 'end', 'end-exec', 'end_frame', 'end_partition', 'enforced', 'enum', 'equals', 'escape', 'event', 'every', 'except', 'exception', 'exclude', 'excluding', 'exclusive', 'exec', 'execute', 'exists', 'exp', 'explain', 'expression', 'extension', 'external', 'extract', 'false', 'family', 'fetch', 'file', 'filter', 'final', 'first', 'first_value', 'flag', 'float', 'floor', 'following', 'for', 'force', 'foreign', 'fortran', 'forward', 'found', 'frame_row', 'free', 'freeze', 'from', 'fs', 'full', 'function', 'functions', 'fusion', 'g', 'general', 'generated', 'get', 'global', 'go', 'goto', 'grant', 'granted', 'greatest', 'group', 'grouping', 'groups', 'handler', 'having', 'header', 'hex', 'hierarchy', 'hold', 'hour', 'id', 'identity', 'if', 'ignore', 'ilike', 'immediate', 'immediately', 'immutable', 'implementation', 'implicit', 'import', 'in', 'including', 'increment', 'indent', 'index', 'indexes', 'indicator', 'inherit', 'inherits', 'initially', 'inline', 'inner', 'inout', 'input', 'insensitive', 'insert', 'instance', 'instantiable', 'instead', 'int', 'integer', 'integrity', 'intersect', 'intersection', 'interval', 'into', 'invoker', 'is', 'isnull', 'isolation', 'join', 'k', 'key', 'key_member', 'key_type', 'label', 'lag', 'language', 'large', 'last', 'last_value', 'lateral', 'lead', 'leading', 'leakproof', 'least', 'left', 'length', 'level', 'library', 'like', 'like_regex', 'limit', 'link', 'listen', 'ln', 'load', 'local', 'localtime', 'localtimestamp', 'location', 'locator', 'lock', 'locked', 'logged', 'lower', 'm', 'map', 'mapping', 'match', 'matched', 'materialized', 'max', 'maxvalue', 'max_cardinality', 'member', 'merge', 'message_length', 'message_octet_length', 'message_text', 'method', 'min', 'minute', 'minvalue', 'mod', 'mode', 'modifies', 'module', 'month', 'more', 'move', 'multiset', 'mumps', 'name', 'names', 'namespace', 'national', 'natural', 'nchar', 'nclob', 'nesting', 'new', 'next', 'nfc', 'nfd', 'nfkc', 'nfkd', 'nil', 'no', 'none', 'normalize', 'normalized', 'not', 'nothing', 'notify', 'notnull', 'nowait', 'nth_value', 'ntile', 'null', 'nullable', 'nullif', 'nulls', 'number', 'numeric', 'object', 'occurrences_regex', 'octets', 'octet_length', 'of', 'off', 'offset', 'oids', 'old', 'on', 'only', 'open', 'operator', 'option', 'options', 'or', 'order', 'ordering', 'ordinality', 'others', 'out', 'outer', 'output', 'over', 'overlaps', 'overlay', 'overriding', 'owned', 'owner', 'p', 'pad', 'parameter', 'parameter_mode', 'parameter_name', 'parameter_ordinal_position', 'parameter_specific_catalog', 'parameter_specific_name', 'parameter_specific_schema', 'parser', 'partial', 'partition', 'pascal', 'passing', 'passthrough', 'password', 'path', 'percent', 'percentile_cont', 'percentile_disc', 'percent_rank', 'period', 'permission', 'placing', 'plans', 'pli', 'policy', 'portion', 'position', 'position_regex', 'power', 'precedes', 'preceding', 'precision', 'prepare', 'prepared', 'preserve', 'primary', 'prior', 'privileges', 'procedural', 'procedure', 'program', 'public', 'quote', 'range', 'rank', 'read', 'reads', 'real', 'reassign', 'recheck', 'recovery', 'recursive', 'ref', 'references', 'referencing', 'refresh', 'regr_avgx', 'regr_avgy', 'regr_count', 'regr_intercept', 'regr_r2', 'regr_slope', 'regr_sxx', 'regr_sxy', 'regr_syy', 'reindex', 'relative', 'release', 'rename', 'repeatable', 'replace', 'replica', 'requiring', 'reset', 'respect', 'restart', 'restore', 'restrict', 'result', 'return', 'returned_cardinality', 'returned_length', 'returned_octet_length', 'returned_sqlstate', 'returning', 'returns', 'revoke', 'right', 'role', 'rollback', 'rollup', 'routine', 'routine_catalog', 'routine_name', 'routine_schema', 'row', 'rows', 'row_count', 'row_number', 'rule', 'savepoint', 'scale', 'schema', 'schema_name', 'scope', 'scope_catalog', 'scope_name', 'scope_schema', 'scroll', 'search', 'second', 'section', 'security', 'select', 'selective', 'self', 'sensitive', 'sequence', 'sequences', 'serializable', 'server', 'server_name', 'session', 'session_user', 'set', 'setof', 'sets', 'share', 'show', 'similar', 'simple', 'size', 'skip', 'smallint', 'snapshot', 'some', 'source', 'space', 'specific', 'specifictype', 'specific_name', 'sql', 'sqlcode', 'sqlerror', 'sqlexception', 'sqlstate', 'sqlwarning', 'sqrt', 'stable', 'standalone', 'start', 'state', 'statement', 'static', 'statistics', 'stddev_pop', 'stddev_samp', 'stdin', 'stdout', 'storage', 'strict', 'strip', 'structure', 'style', 'subclass_origin', 'submultiset', 'substring', 'substring_regex', 'succeeds', 'sum', 'symmetric', 'sysid', 'system', 'system_time', 'system_user', 't', 'table', 'tables', 'tablesample', 'tablespace', 'table_name', 'temp', 'template', 'temporary', 'text', 'then', 'ties', 'time', 'timestamp', 'timezone_hour', 'timezone_minute', 'to', 'token', 'top_level_count', 'trailing', 'transaction', 'transactions_committed', 'transactions_rolled_back', 'transaction_active', 'transform', 'transforms', 'translate', 'translate_regex', 'translation', 'treat', 'trigger', 'trigger_catalog', 'trigger_name', 'trigger_schema', 'trim', 'trim_array', 'true', 'truncate', 'trusted', 'type', 'types', 'uescape', 'unbounded', 'uncommitted', 'under', 'unencrypted', 'union', 'unique', 'unknown', 'unlink', 'unlisten', 'unlogged', 'unnamed', 'unnest', 'until', 'untyped', 'update', 'upper', 'uri', 'usage', 'user', 'user_defined_type_catalog', 'user_defined_type_code', 'user_defined_type_name', 'user_defined_type_schema', 'using', 'vacuum', 'valid', 'validate', 'validator', 'value', 'values', 'value_of', 'varbinary', 'varchar', 'variadic', 'varying', 'var_pop', 'var_samp', 'verbose', 'version', 'versioning', 'view', 'views', 'volatile', 'when', 'whenever', 'where', 'whitespace', 'width_bucket', 'window', 'with', 'within', 'without', 'work', 'wrapper', 'write', 'xml', 'xmlagg', 'xmlattributes', 'xmlbinary', 'xmlcast', 'xmlcomment', 'xmlconcat', 'xmldeclaration', 'xmldocument', 'xmlelement', 'xmlexists', 'xmlforest', 'xmliterate', 'xmlnamespaces', 'xmlparse', 'xmlpi', 'xmlquery', 'xmlroot', 'xmlschema', 'xmlserialize', 'xmltable', 'xmltext', 'xmlvalidate', 'year', 'yes', 'zone']); function preparesqlname(name) {     if (isunquotedidentifier.test(name) && !reservedwords.has(name.touppercase()))         return name;      // let's quote name:     return '"' + name.replace(/"/g, '""') + '"'; }  console.log(preparesqlname('simple')); //simple console.log(preparesqlname('to "quote')); //"to ""quote" console.log(preparesqlname('простий')); //простий console.log(preparesqlname('в "лапки')); //"в ""лапки" 

notes:

  1. we need xregexp module, because built-in regexp class doesn't support testing of unicode categories (\pl a.k.a. p{l} , \pm a.k.a \p{m}). still, detecting letters not 100%-clear (see "details on detecting letters").
  2. list of key words taken here. using list may non-ideal solution, not of these identifiers prohibited use in possible cases (i.e. may quote table name allowed unquoted, e.g. abs). on other hand, task cannot solved ideally without adding namepurpose second parameter preparesqlname function (as of these identifiers allowed use in specific cases only).
  3. you not need special handling of non-ascii, if want non-english letters remain non-escaped (they may non-escaped in select "ім'я" користувачі). thing checked identifiers containing other letters/digits/_/$ should quoted (as "ім'я"). may additionally escape non-ascii characters, if wish:

    var isunquotedidentifier = new regexp('^[a-za-z_][a-za-z_0-9$]*$'); var reservedwords = new set(['a', 'abort', 'abs', 'absent', 'absolute', 'access', 'according', 'action', 'ada', 'add', 'admin', 'after', 'aggregate', 'all', 'allocate', 'also', 'alter', 'always', 'analyse', 'analyze', 'and', 'any', 'are', 'array', 'array_agg', 'array_max_cardinality', 'as', 'asc', 'asensitive', 'assertion', 'assignment', 'asymmetric', 'at', 'atomic', 'attribute', 'attributes', 'authorization', 'avg', 'backward', 'base64', 'before', 'begin', 'begin_frame', 'begin_partition', 'bernoulli', 'between', 'bigint', 'binary', 'bit', 'bit_length', 'blob', 'blocked', 'bom', 'boolean', 'both', 'breadth', 'by', 'c', 'cache', 'call', 'called', 'cardinality', 'cascade', 'cascaded', 'case', 'cast', 'catalog', 'catalog_name', 'ceil', 'ceiling', 'chain', 'char', 'character', 'characteristics', 'characters', 'character_length', 'character_set_catalog', 'character_set_name', 'character_set_schema', 'char_length', 'check', 'checkpoint', 'class', 'class_origin', 'clob', 'close', 'cluster', 'coalesce', 'cobol', 'collate', 'collation', 'collation_catalog', 'collation_name', 'collation_schema', 'collect', 'column', 'columns', 'column_name', 'command_function', 'command_function_code', 'comment', 'comments', 'commit', 'committed', 'concurrently', 'condition', 'condition_number', 'configuration', 'conflict', 'connect', 'connection', 'connection_name', 'constraint', 'constraints', 'constraint_catalog', 'constraint_name', 'constraint_schema', 'constructor', 'contains', 'content', 'continue', 'control', 'conversion', 'convert', 'copy', 'corr', 'corresponding', 'cost', 'count', 'covar_pop', 'covar_samp', 'create', 'cross', 'csv', 'cube', 'cume_dist', 'current', 'current_catalog', 'current_date', 'current_default_transform_group', 'current_path', 'current_role', 'current_row', 'current_schema', 'current_time', 'current_timestamp', 'current_transform_group_for_type', 'current_user', 'cursor', 'cursor_name', 'cycle', 'data', 'database', 'datalink', 'date', 'datetime_interval_code', 'datetime_interval_precision', 'day', 'db', 'deallocate', 'dec', 'decimal', 'declare', 'default', 'defaults', 'deferrable', 'deferred', 'defined', 'definer', 'degree', 'delete', 'delimiter', 'delimiters', 'dense_rank', 'depth', 'deref', 'derived', 'desc', 'describe', 'descriptor', 'deterministic', 'diagnostics', 'dictionary', 'disable', 'discard', 'disconnect', 'dispatch', 'distinct', 'dlnewcopy', 'dlpreviouscopy', 'dlurlcomplete', 'dlurlcompleteonly', 'dlurlcompletewrite', 'dlurlpath', 'dlurlpathonly', 'dlurlpathwrite', 'dlurlscheme', 'dlurlserver', 'dlvalue', 'do', 'document', 'domain', 'double', 'drop', 'dynamic', 'dynamic_function', 'dynamic_function_code', 'each', 'element', 'else', 'empty', 'enable', 'encoding', 'encrypted', 'end', 'end-exec', 'end_frame', 'end_partition', 'enforced', 'enum', 'equals', 'escape', 'event', 'every', 'except', 'exception', 'exclude', 'excluding', 'exclusive', 'exec', 'execute', 'exists', 'exp', 'explain', 'expression', 'extension', 'external', 'extract', 'false', 'family', 'fetch', 'file', 'filter', 'final', 'first', 'first_value', 'flag', 'float', 'floor', 'following', 'for', 'force', 'foreign', 'fortran', 'forward', 'found', 'frame_row', 'free', 'freeze', 'from', 'fs', 'full', 'function', 'functions', 'fusion', 'g', 'general', 'generated', 'get', 'global', 'go', 'goto', 'grant', 'granted', 'greatest', 'group', 'grouping', 'groups', 'handler', 'having', 'header', 'hex', 'hierarchy', 'hold', 'hour', 'id', 'identity', 'if', 'ignore', 'ilike', 'immediate', 'immediately', 'immutable', 'implementation', 'implicit', 'import', 'in', 'including', 'increment', 'indent', 'index', 'indexes', 'indicator', 'inherit', 'inherits', 'initially', 'inline', 'inner', 'inout', 'input', 'insensitive', 'insert', 'instance', 'instantiable', 'instead', 'int', 'integer', 'integrity', 'intersect', 'intersection', 'interval', 'into', 'invoker', 'is', 'isnull', 'isolation', 'join', 'k', 'key', 'key_member', 'key_type', 'label', 'lag', 'language', 'large', 'last', 'last_value', 'lateral', 'lead', 'leading', 'leakproof', 'least', 'left', 'length', 'level', 'library', 'like', 'like_regex', 'limit', 'link', 'listen', 'ln', 'load', 'local', 'localtime', 'localtimestamp', 'location', 'locator', 'lock', 'locked', 'logged', 'lower', 'm', 'map', 'mapping', 'match', 'matched', 'materialized', 'max', 'maxvalue', 'max_cardinality', 'member', 'merge', 'message_length', 'message_octet_length', 'message_text', 'method', 'min', 'minute', 'minvalue', 'mod', 'mode', 'modifies', 'module', 'month', 'more', 'move', 'multiset', 'mumps', 'name', 'names', 'namespace', 'national', 'natural', 'nchar', 'nclob', 'nesting', 'new', 'next', 'nfc', 'nfd', 'nfkc', 'nfkd', 'nil', 'no', 'none', 'normalize', 'normalized', 'not', 'nothing', 'notify', 'notnull', 'nowait', 'nth_value', 'ntile', 'null', 'nullable', 'nullif', 'nulls', 'number', 'numeric', 'object', 'occurrences_regex', 'octets', 'octet_length', 'of', 'off', 'offset', 'oids', 'old', 'on', 'only', 'open', 'operator', 'option', 'options', 'or', 'order', 'ordering', 'ordinality', 'others', 'out', 'outer', 'output', 'over', 'overlaps', 'overlay', 'overriding', 'owned', 'owner', 'p', 'pad', 'parameter', 'parameter_mode', 'parameter_name', 'parameter_ordinal_position', 'parameter_specific_catalog', 'parameter_specific_name', 'parameter_specific_schema', 'parser', 'partial', 'partition', 'pascal', 'passing', 'passthrough', 'password', 'path', 'percent', 'percentile_cont', 'percentile_disc', 'percent_rank', 'period', 'permission', 'placing', 'plans', 'pli', 'policy', 'portion', 'position', 'position_regex', 'power', 'precedes', 'preceding', 'precision', 'prepare', 'prepared', 'preserve', 'primary', 'prior', 'privileges', 'procedural', 'procedure', 'program', 'public', 'quote', 'range', 'rank', 'read', 'reads', 'real', 'reassign', 'recheck', 'recovery', 'recursive', 'ref', 'references', 'referencing', 'refresh', 'regr_avgx', 'regr_avgy', 'regr_count', 'regr_intercept', 'regr_r2', 'regr_slope', 'regr_sxx', 'regr_sxy', 'regr_syy', 'reindex', 'relative', 'release', 'rename', 'repeatable', 'replace', 'replica', 'requiring', 'reset', 'respect', 'restart', 'restore', 'restrict', 'result', 'return', 'returned_cardinality', 'returned_length', 'returned_octet_length', 'returned_sqlstate', 'returning', 'returns', 'revoke', 'right', 'role', 'rollback', 'rollup', 'routine', 'routine_catalog', 'routine_name', 'routine_schema', 'row', 'rows', 'row_count', 'row_number', 'rule', 'savepoint', 'scale', 'schema', 'schema_name', 'scope', 'scope_catalog', 'scope_name', 'scope_schema', 'scroll', 'search', 'second', 'section', 'security', 'select', 'selective', 'self', 'sensitive', 'sequence', 'sequences', 'serializable', 'server', 'server_name', 'session', 'session_user', 'set', 'setof', 'sets', 'share', 'show', 'similar', 'simple', 'size', 'skip', 'smallint', 'snapshot', 'some', 'source', 'space', 'specific', 'specifictype', 'specific_name', 'sql', 'sqlcode', 'sqlerror', 'sqlexception', 'sqlstate', 'sqlwarning', 'sqrt', 'stable', 'standalone', 'start', 'state', 'statement', 'static', 'statistics', 'stddev_pop', 'stddev_samp', 'stdin', 'stdout', 'storage', 'strict', 'strip', 'structure', 'style', 'subclass_origin', 'submultiset', 'substring', 'substring_regex', 'succeeds', 'sum', 'symmetric', 'sysid', 'system', 'system_time', 'system_user', 't', 'table', 'tables', 'tablesample', 'tablespace', 'table_name', 'temp', 'template', 'temporary', 'text', 'then', 'ties', 'time', 'timestamp', 'timezone_hour', 'timezone_minute', 'to', 'token', 'top_level_count', 'trailing', 'transaction', 'transactions_committed', 'transactions_rolled_back', 'transaction_active', 'transform', 'transforms', 'translate', 'translate_regex', 'translation', 'treat', 'trigger', 'trigger_catalog', 'trigger_name', 'trigger_schema', 'trim', 'trim_array', 'true', 'truncate', 'trusted', 'type', 'types', 'uescape', 'unbounded', 'uncommitted', 'under', 'unencrypted', 'union', 'unique', 'unknown', 'unlink', 'unlisten', 'unlogged', 'unnamed', 'unnest', 'until', 'untyped', 'update', 'upper', 'uri', 'usage', 'user', 'user_defined_type_catalog', 'user_defined_type_code', 'user_defined_type_name', 'user_defined_type_schema', 'using', 'vacuum', 'valid', 'validate', 'validator', 'value', 'values', 'value_of', 'varbinary', 'varchar', 'variadic', 'varying', 'var_pop', 'var_samp', 'verbose', 'version', 'versioning', 'view', 'views', 'volatile', 'when', 'whenever', 'where', 'whitespace', 'width_bucket', 'window', 'with', 'within', 'without', 'work', 'wrapper', 'write', 'xml', 'xmlagg', 'xmlattributes', 'xmlbinary', 'xmlcast', 'xmlcomment', 'xmlconcat', 'xmldeclaration', 'xmldocument', 'xmlelement', 'xmlexists', 'xmlforest', 'xmliterate', 'xmlnamespaces', 'xmlparse', 'xmlpi', 'xmlquery', 'xmlroot', 'xmlschema', 'xmlserialize', 'xmltable', 'xmltext', 'xmlvalidate', 'year', 'yes', 'zone']); function preparesqlname(name) {     if (isunquotedidentifier.test(name) && !reservedwords.has(name.touppercase()))         return name;      // let's quote (and possibly escape) name:     var reallyescaped = false;     var escapedname = '';     (var = 0; i!=name.length; ++i) {         var charcode = name.charcodeat(i);         if (charcode<128) {             var char = name.charat(i);             escapedname += char=='"' ? '""' : char;         }         else {             reallyescaped = true;             escapedname += '\\' + ('0000' + charcode.tostring(16)).slice(-4);         }     }     return reallyescaped ? 'u&"' + escapedname + '"' : '"' + escapedname + '"'; }  console.log(preparesqlname('simple')); //simple console.log(preparesqlname('to "quote')); //"to ""quote" console.log(preparesqlname('простий')); //u&"\043f\0440\043e\0441\0442\0438\0439" console.log(preparesqlname('в "лапки')); //u&"\0432 ""\043b\0430\043f\043a\0438" 
  4. i recommend lowercase name calling name = name.tolowercase(); before quoting it. without doing you'll need take care fact hello , hello treated same name, hello world , hello world different (because latter 2 got quoted). doing make code incompatible code doesn't forcibly lowercase names (i.e. won't able access entities non-all-lowercase names contain other letter, digit, underscore or dollar sign).


details on detecting letters:

i'm not sure this, afaik different versions of unicode standard can have different sets of code points letters (i.e. code point considered letter in 1 version of unicode standard may considered non-letter other version of unicode standard). if that's true, you'll weird sql syntax error, if javascript considers code point non-english letter , therefore sends unquoted, while postgresql considers non-letter @ , therefore doesn't expect outside quotes (e.g. due older unicode libraries within postgresql server). overcome this, can either forcibly quote every non-ascii character (by replacing \pl a-za-z), or @ least ensure javascript's definition of letter isn't wider postgresql (by replacing \pl explicit code range list specific unicode version here unicode 6.3.0).

de facto postgresql allows unicode character (except forbidden characters within ascii range , invalid surrogates) used within unquoted identifiers -- not letters/digits/_/$ -- , non-ascii whitespace , unassigned code points in fact allowed (although official documentation says nothing that). done prevent future compatibility issues (e.g. 1 described in strikeout text). reasonable behavior @ our side in case send letters/digits/_/$ postgresql in unquoted form (as official documentation suggests), don't need worry following exact version of unicode standard detecting them.

the question still remaining whether should treat \pl letters, or \pl\pm* (the former may include letter diacritic if it's represented single code point, while latter includes kinds of letters diacritics). de jure official documentation says nothing question, while de facto postgresql accepts unicode character. probably, in such case latter more reasonable (that's why replaced ^[\pl_][\pl_0-9$]*$ original version of first code listing ^[\pl_][\pl\pm_0-9$]*$).

resume:

  1. whether to escape (u&"\044f") or not escape ("я") character within quoted identifier -- it's absolutely matter of taste (per documentation, character except nul may appear non-escaped in quoted identifier, nul isn't allowed within identifiers anyway).
  2. whether to quote ("hello") or not quote (hello) identifier:
    • an identifier must quoted if contains ascii character other english letter, digit 0 9, underscore sign or dollar sign, or if starts ascii character other english letter or underscore sign;
    • an identifier should quoted if contains non-ascii character other letter of language, or without diacritics; de facto such identifier accepted without quoting, such behavior isn't mentioned in documentation , may changed in future;
    • otherwise it's choice whether quote identifier or no; if prefer case insensitivity, may tend first listing (which quoting @ minimum , no escaping), if unlike non-ascii characters, may tend second listing (which quotes , escapes every non-ascii character, officially allowed).

Comments

Popular posts from this blog

php - Wordpress website dashboard page or post editor content is not showing but front end data is showing properly -

javascript - Twitter Bootstrap - how to add some more margin between tooltip popup and element -

javascript - Get parameter of GET request -