SELECT * FROM vedeo_list WHERE title REGEXP "DUNE|DUNE|dune|デューン";
The REGEXP operator is a special syntax for the regexp() user function. No regexp() user function is defined by default and so use of the REGEXP operator will normally result in an error message. If an application-defined SQL function named "regexp" is added at run-time, then the "X REGEXP Y" operator will be implemented as a call to "regexp(Y,X)".
とあります。 Google翻訳さんにお願いすると、
REGEXP 演算子は、regexp() ユーザー関数の特別な構文です。 デフォルトでは regexp() ユーザー関数は定義されていないため、REGEXP 演算子を使用すると、通常はエラー メッセージが表示されます。 「regexp」というアプリケーション定義の SQL 関数が実行時に追加されると、「X REGEXP Y」演算子が「regexp(Y,X)」の呼び出しとして実装されます。
と訳してくれました。どうやらユーザ関数を定義しないとだめっぽいです。 一応 ext/misc/regexp.c というファイルはソースツリーにあるので、DLLかなんかの形であるのかな?とPythonのディレクトリを見てみると、DLLsディレクトリに sqlite3.dll と regexp.dll がありました。これを dumpbin してみると、
>dumpbin /exports regexp.dll Microsoft (R) COFF/PE Dumper Version 14.34.31937.0 Copyright (C) Microsoft Corporation. All rights reserved. Dump of file C:\Apps\Python311\DLLs\regexp.dll File Type: DLL Section contains the following exports for regexp.dll 00000000 characteristics FFFFFFFF time date stamp 0.00 version 1 ordinal base 1 number of functions 1 number of names ordinal hint RVA name 1 0 00003160 sqlite3_regexp_init Summary 2000 .data 2000 .pdata A000 .rdata 1000 .reloc 10000 .text 1000 _RDATAとなっているので、このDLLを読み込んでやればきっといいんでしょうね。
An SQLite extension is a shared library or DLL. To load it, you need to supply SQLite with the name of the file containing the shared library or DLL and an entry point to initialize the extension. In C code, this information is supplied using the sqlite3_load_extension() API. See the documentation on that routine for additional information.
Note that different operating systems use different filename suffixes for their shared libraries. Windows uses ".dll", Mac uses ".dylib", and most unixes other than mac use ".so". If you want to make your code portable, you can omit the suffix from the shared library filename and the appropriate suffix will be added automatically by the sqlite3_load_extension() interface.
There is also an SQL function that can be used to load extensions: load_extension(X,Y). It works just like the sqlite3_load_extension() C interface.
Both methods for loading an extension allow you to specify the name of an entry point for the extension. You can leave this argument blank - passing in a NULL pointer for the sqlite3_load_extension() C-language interface or omitting the second argument for the load_extension() SQL interface - and the extension loader logic will attempt to figure out the entry point on its own. It will first try the generic extension name "sqlite3_extension_init". If that does not work, it constructs a entry point using the template "sqlite3_X_init" where the X is replaced by the lowercase equivalent of every ASCII character in the filename after the last "/" and before the first following "." omitting the first three characters if they happen to be "lib". So, for example, if the filename is "/usr/lib/libmathfunc-4.8.so" the entry point name would be "sqlite3_mathfunc_init". Or if the filename is "./SpellFixExt.dll" then the entry point would be called "sqlite3_spellfixext_init".
For security reasons, extension loading is turned off by default. In order to use either the C-language or SQL extension loading functions, one must first enable extension loading using the sqlite3_db_config(db,SQLITE_DBCONFIG_ENABLE_LOAD_EXTENSION,1,NULL) C-language API in your application.
> sqlite3 SQLite version 3.40.0 2022-11-16 12:10:08 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. sqlite> .dbconfig defensive off dqs_ddl on dqs_dml on enable_fkey off enable_qpsg off enable_trigger on enable_view on fts3_tokenizer off legacy_alter_table off legacy_file_format off load_extension on no_ckpt_on_close off reset_database off trigger_eqp off trusted_schema on writable_schema off sqlite>下から6番目に "load_extension on" とあります。有効化はされているようです。ところが、CLI から load_extension("regexp.dll") とかやってもエラーになるばかり。
> python Python 3.11.1 (tags/v3.11.1:a7a450f, Dec 6 2022, 19:58:39) [MSC v.1934 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import sqlite3 >>> con=sqlite3.connect("videolist.db") >>> cur=con.cursor() >>> con.enable_load_extension(True) >>> con.load_extension("c:/Apps/python311/DLLs/regexp.dll") >>> cur.execute("SELECT * FROM videolist WHERE title REGEXP 'DUNE|DUNE'") <sqlite3.Cursor object at 0x000001C72971A140> >>> row = cur.fetchall()え、なんかできてるし。 つまり、Python からならこれだけで SQLite3 の正規表現は使える、ということですね。 ついでに、さっき見たソースコードに正規表現ルールが書いてあったので引用しておきます。
The following regular expression syntax is supported: X* zero or more occurrences of X X+ one or more occurrences of X X? zero or one occurrences of X X{p,q} between p and q occurrences of X (X) match X X|Y X or Y ^X X occurring at the beginning of the string X$ X occurring at the end of the string . Match any single character \c Character c where c is one of \{}()[]|*+?. \c C-language escapes for c in afnrtv. ex: \t or \n \uXXXX Where XXXX is exactly 4 hex digits, unicode value XXXX \xXX Where XX is exactly 2 hex digits, unicode value XX [abc] Any single character from the set abc [^abc] Any single character not in the set abc [a-z] Any single character in the range a-z [^a-z] Any single character not in the range a-z \b Word boundary \w Word character. [A-Za-z0-9_] \W Non-word character \d Digit \D Non-digit \s Whitespace character \S Non-whitespace character A nondeterministic finite automaton (NFA) is used for matching, so the performance is bounded by O(N*M) where N is the size of the regular expression and M is the size of the input string. The matcher never exhibits exponential behavior. Note that the X{p,q} operator expands to p copies of X following by q-p copies of X? and that the size of the regular expression in the O(N*M) performance bound is computed after this expansion.非決定論的有限オートマトンなのでパフォーマンスは O(N*M) に制限される、とのことですが、とにかく動くことが大事なので見なかったことにします。 Python の re モジュールとはちょっと違うようですし、PCRE とも違いますが、ここまでわかればなんとかなりそうです。 実際のコードは、
import sqlite3 from sys import exec_prefix conn = sqlite3.connect(db_file) conn.enable_load_extension(True) conn.load_extension(exec_prefix + "/DLLs/regexp.dll") conn.row_factory = sqlite3.Row cur = conn.cursor()みたいに使えばいいでしょう。
0 件のコメント:
コメントを投稿