This effectively reverts b380514764 and
c24473ddff, restoring the state at
598cd52272.
Before pushing, please check that what you're about to push is sane.
Check your local commit log and ensure there isn't anything out-of-place
before pushing to mainline. When things like this happen, it wastes
everyone's time. I really don't need this in a week when real work™ is
busting my balls and I'm behind where I want to be with preparing for
MAME release.
Memory management in plib is now alignment-aware. All allocations
respect c++11 alignas. Selected classes like parray and aligned_vector
also provide hints (__builtin_assume_aligned) to g++ and clang.
The alignment optimizations have little impact on the current use cases.
They only become effective on bigger data processing.
What has a measurable impact is memory pooling. This speeds up netlist
games like breakout and pong by about 5%.
Tested with linux, macosx and windows cross builds. All features are
disabled since I can not rule out they may temporarily break more exotic
builds.
Set USE_MEMPOOL to 1 to try this (max 5% performance increase).
For mingw, there is no alignment support. This triggers -Wattribute
errors which due to -Werror crash the build.
- convert macros to c++ code.
- order of device creation should not depend on std lib.
- some state saving cleanup.
- added support for clang-tidy to makefile.
- modifications triggered by clang-tidy-9.
Still some work ahead to separate interface from execution. This is a
preparation to switch to another sparse matrix format easily which may
be better suited for parallel processing.
On the linear algebra side there are some nice additions:
- Two additional sort modes: One tries to obtain a upper left identity
matrix, the other prefers a diagonal band matrix structure. Both deliver
slightly better performance than just sorting.
- Parallel execution analysis for Gaussian elimination and LU solve.
This determines which operations may be done independently.
All of this is not really useful right now. The matrix sizes are below
100 nets. I estimate that we at least need four times more so that CPU
parallel processing overhead pays off. For GPU, add another order. But
it's nice to have code which may scale.
This is an effort to separate netlist creation from netlist execution.
The primary target is to avoid that code which will only run during
execution is able to call setup code and thus create ugly hacks.
This change removes all string extensions like trim, rpad, left, right,
... from pstring and replaces them by function templates.
This aligns a lot better with the intentions of the standard library.
- OPENMP refactored. All OPENMP operations are now templatized in pomp.h
- We don't need thread-safe priority queue. Event code updating analog
outputs now runs outside the parallel code.
(nw)
0: Parallel processing of solvers disabled
1: One processor parallel processing. Can be used to measure OPENMP
overhead
>1: Solve n analog subnets in parallel.
Previously, all available processors were used which caused performance
to degrade on hyperthreading.
[Couriersud]
... for updates. This will make device implementation more flexible and
faster. A nice side-effect is that there was some minor (<5%)
performance increase already. Each input is now assigned a notification
handler. Currently this is update, but going forward this may be a
custom handler. In addition
- fixed MEMPOOL on OSX
- removed dead code
- avoid bit-rot
- added delegate support for emscripten and arm processors
- added delegate support for VS 2015 x64
[Couriersud]
- Fixed crashes on terminals without nets (i.e. connected to a rail)
- Reviewed "FIXMEs" and corrected some minor ones.
- Made m_cur_analog protected.
- Fixed pmf delegates to work with msvc.
- More optimizations to the solver code.
- Started work on a better signal pipeline in nlwav
- Only generate documentation for entities which are documented.
[Couriersud]
Device implementations (all cpp files in netlist/devices) now should
only include nl_base.h.
Netlist implementation sources should only include "net_lib.h".
Refactored netlist.h and netlist.cpp to avoid namespace congestion in
netlist.h.
Fixed VC2015 build. (nw)