Ticket Hash: | 2bfd91266ad9202517f7cb798c54c680f85e6278 | ||
Title: | CPU is burned when parallel work is done | ||
Status: | Open | Type: | Performance |
Severity: | Important | Priority: | Medium |
Subsystem: | Integration_Via_PInvoke | Resolution: | Under_Review |
Last Modified: |
2021-05-04 07:21:20 4.05 years ago |
Created: |
2021-05-02 22:24:56 4.05 years ago |
Version Found In: | newest and several years back |
User Comments: | ||||
anonymous added on 2021-05-02 22:24:56:
The problem is hard to reproduce. It seems it gets easier the faster the CPU is. While a 10 years old i7 just doesn't scale well, a modern Ryzen or Xeon cpu falls back below single core performance on parallelism of 4 or higher! But still the full CPU is getting burned: The more CPUs you insert into the system the slower the total gets. I hope I can attach the test case later - currently there is no button to do so. The test case in abstract:
I compared many versions of System.Data.SQLite (SDS) and Microsoft.Data.Sqlite (MDS): All time the same: SDS has problems and MDS just works like a charm - so I'm pretty sure the problem is not located in native SQLite but somewhere in this wrapper; maybe the nested while(true)->for(;;)-loops - but I don't know for sure. But this doesn't help me because my app consists of thousands of queries going through EntityFramework6 - there is no fast transition to EFcore. And SDS provides many additional features like Module Concrete values (on an i7 with 4 cores (8 HT)) using a 65GB database using SDS 1 concurrent thread : 13 loops per thread -> 13 total (2s CPU time) 2 concurrent threads: 7-8 loops per thread -> 15 total (4s CPU time) 3 concurrent threads: 5 loops per thread -> 15 total (5.5s CPU time) In contrast when using MDS: 1 concurrent thread : 13 loops per thread -> 13 total (2s CPU time) 2 concurrent threads: 13 loops per thread -> 26 total (4s CPU time) 3 concurrent threads: 12 loops per thread -> 36 total (6s CPU time) (the i7 doesn't fall below single thread performance; I'll post values for a Ryzen7 tomorrow) Note: Not every query performs bad, a simple "select * from Table limit 10" might scale well. Feel free to contact me by mail. This is a very catastrophic issue for my app and solving this may be useful for many others, too. *) I noticed this 4 thread-limit years ago, but only worked against one single DB at that time, where it wasn't a real problem. To my shame I never investigated further till now: The test case uses one private connection to one private DB(copy) and it still doesn't scale.. well, the CPU usage scales, keeps at 100% while the total work done decreases the more parallel threads are working (each using its own db); I assume there is some spin-lock which spins far too much. mistachkin added on 2021-05-02 23:30:57: Do you know where the excess CPU is being consumed? Is it for some busy waits, e.g. in SQLite3.Prepare, etc? What is the version of Microsoft.Data.Sqlite are you using? Was it obtained from NuGet? I assume, based on the version of System.Data.SQLite, that you are using the version from NuGet verbatim? Is that correct? What is the target framework for your project? Also, is the processor architecture x86 or x64? mistachkin added on 2021-05-02 23:41:55: Since this is a longstanding issue, it will NOT block the 1.0.114.0 release. anonymous added on 2021-05-03 07:53:33: All threads are always at the same position when I pause the debugger: - SDS.UnsafeNativeMethods.sqlite3_step (stmt) - (Unmanaged code) - SDS.SQLite3.Step (stmt) - SDS.SQLiteDataReader.NextResult() This is also true for the queries which run fast. I tried multiple versions of each MDS (6-preview3, 5.0.5, 3.1.14) and SDS downto 1.0.111 : 3.1.14 and 1.0.111 even are using the same SQLite 3.28.0 version. All from NuGet.org (just see the attached .linq files) - architecture is always x64 and I tried both netcore3.1 and net5. I'll attach two screenshots which show the difference between SDS and MDS on a Ryzen7 8core/16HT machine (Intel Xeons behaving the same) - exactly same DBs and query, everything loaded into RAM before. anonymous added on 2021-05-03 11:28:22: When I break into MDS while all threads are busy, all have this callstack, where "(Unmanaged code)" is always on top (while SDS always has one stack frame on top of "(Unmanaged code)" - this might be either more debug symbols, or some kind of callback?
mistachkin added on 2021-05-03 20:31:15: The output from "PRAGMA compile_options" may be revealing here as it sounds like the core native library may have been compiled with different options. anonymous added on 2021-05-03 21:42:52: SDS: msvc-1900 MDS: msvc-1922 (3.1.14) and msvc-1928 (6.0.0-preview3) mistachkin added on 2021-05-04 01:26:17: Is that really the complete lists of compile options? Perhaps that is only the first one? There should be many more than that, even in the simplest of build configurations. anonymous added on 2021-05-04 07:21:20: Oops, sorry, I only executed a scalar and thought the number is a bit-flag combination. latest MDS (SQLitePCLRaw.lib.e_sqlite3 2.0.5-pre20210119130047): COMPILER=msvc-1928 DEFAULT_FOREIGN_KEYS ENABLE_COLUMN_METADATA ENABLE_FTS3_PARENTHESIS ENABLE_FTS4 ENABLE_FTS5 ENABLE_JSON1 ENABLE_RTREE THREADSAFE=1 SDS 1.0.113.7: COMPILER=msvc-1900 ENABLE_API_ARMOR ENABLE_COLUMN_METADATA ENABLE_DBSTAT_VTAB ENABLE_FTS3 ENABLE_LOAD_EXTENSION ENABLE_MEMORY_MANAGEMENT ENABLE_PREUPDATE_HOOK ENABLE_RTREE ENABLE_SESSION ENABLE_STAT4 ENABLE_STMTVTAB MAX_ATTACHED=30 SOUNDEX THREADSAFE=1 USE_URI WIN32_MALLOC |