Stefan Brüns
85e8c09ff6
Fix GetTimeCounter for Aarch64 variants
...
Neither GCC nor Clang defines an __arm64__ preprocessor macro; both use
__aarch64__ (MSVC uses _M_ARM64). Add a "64" suffix to the define, i.e.
NETGEN_ARCH_ARM64, to make it more obvious that it only refers to aarch64, and
to be in line with NETGEN_ARCH_AMD64.
Replace the (Clang-specific) __builtin_readcyclecounter with inline
asm, because the builtin has several problems:
- It returns cycles (i.e. the value varies with CPU frequency), not time
- It may return 0, depending on the PMU settings
- It may cause an illegal instruction fault if the access is not trapped by
the kernel, e.g. on FreeBSD.
Reading the generic timer/counter CNTVCT_EL0 instead of PMCCNTR_EL0 avoids
these pitfalls. The inline asm works with both GCC and Clang, whereas the
builtin is Clang-only.
2021-03-03 17:30:33 +01:00
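As an illustration of the approach described in the commit above, a counter read via inline asm might look roughly like the following. This is a minimal sketch, not the code from the commit: the function name ReadTimeCounter and the std::chrono fallback for non-aarch64 builds are assumptions made here so the example compiles everywhere.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper for illustration; not Netgen's actual GetTimeCounter.
inline std::uint64_t ReadTimeCounter()
{
#if defined(__aarch64__)
    // GCC and Clang both define __aarch64__ and accept this inline asm.
    // CNTVCT_EL0 is the virtual count of the generic timer; it ticks at a
    // fixed frequency, unlike the PMU cycle counter PMCCNTR_EL0.
    std::uint64_t ticks;
    asm volatile("mrs %0, cntvct_el0" : "=r"(ticks));
    return ticks;
#else
    // Assumed fallback so the sketch builds on other architectures.
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}
```

Two consecutive calls should yield non-decreasing values; converting ticks to seconds on aarch64 would additionally require the timer frequency (readable from CNTFRQ_EL0), which is omitted here.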
Joachim Schoeberl
979a695f62
fixing warnings
2021-02-18 10:30:01 +01:00
Joachim Schoeberl
87e472b6fc
start face-hierarchy in Netgen
2021-02-17 14:54:14 +01:00
Christopher Lackner
0c2430f3dc
add std::any symboltable to Flags to store arbitrary objects
2021-02-08 15:44:15 +01:00
Joachim Schoeberl
25011c8407
arm-simd: HSum, tuple support
2021-02-05 11:59:03 +01:00
Joachim Schöberl
9a9828d3af
some more arm-simds
2021-01-31 16:31:47 +01:00
Joachim Schöberl
18f5a933a9
arm-simd working
2021-01-30 21:02:49 +01:00
Joachim Schöberl
f53c069308
prepare SIMD for arm64
2021-01-30 20:05:28 +01:00
Joachim Schöberl
ea7f6c1e94
fnma intrinsic for avx512
2020-12-22 13:06:08 +01:00
Joachim Schöberl
c1c10174be
FNMA asm-instruction
2020-12-22 09:37:21 +01:00
Matthias Hochsteger
94ecf8de92
Fix private linking of Python
...
The CMake export of Interface libraries also exports PRIVATE build
settings, which leads to build errors with non-existing include paths and .lib files for binary distributions.
Use the work-around mentioned here to circumvent this behavior:
https://gitlab.kitware.com/cmake/cmake/-/issues/15415#note_849405
2020-12-18 11:05:10 +01:00
Joachim Schöberl
d30accdc1a
Merge branch 'apple_silicon' into 'master'
...
Support for Apple M1
See merge request jschoeberl/netgen!359
2020-12-16 20:47:21 +00:00
Matthias Hochsteger
eb6ac164e7
int64_t for masks
2020-12-16 21:00:12 +01:00
Matthias Hochsteger
d97a9a6594
Alignment for generic SIMD classes
2020-12-16 17:20:18 +01:00
Matthias Hochsteger
e68d8cea9b
workaround for missing intrinsic on GCC 7
2020-12-16 10:58:01 +01:00
Matthias Hochsteger
9c0dbec8c9
Fix SIMD<mask64> ctor and Unpack
2020-12-15 15:31:17 +01:00
Matthias Hochsteger
dbe894fea3
Support for Apple M1
2020-12-15 10:43:11 +01:00
Matthias Hochsteger
1b55c51da5
remove AlignedAlloc, use alignas
2020-12-15 09:40:43 +01:00
Matthias Hochsteger
1f3aebcec0
Fix AVX-Operators for int64_t simd (use generic ones)
2020-12-15 09:40:43 +01:00
mhochsteger
f213a7a5b1
fix fabs for AVX on Windows
2020-12-14 15:50:27 +01:00
Matthias Hochsteger
248145bbf0
fix wrong simd operators
2020-12-14 12:47:53 +01:00
Matthias Hochsteger
fc44eb95df
simd - array and variadic ctor
2020-12-11 23:12:34 +01:00
Matthias Hochsteger
2d667a08dc
move (refactored) SIMD headers from ngsolve into ngcore
2020-12-11 20:54:41 +01:00
Christopher Lackner
cb0d8295bf
fix hashing of bitarray (uninitialized value in HashArchive & random values at end)
2020-11-25 22:07:07 +01:00
Matthias Hochsteger
91f127ef71
memory tracer - fix memory accumulation of children
2020-11-25 14:34:29 +01:00
Matthias Hochsteger
b55264e0ee
memory tracing - handle multiple consecutive tracers correctly
2020-11-24 19:20:21 +01:00
Matthias Hochsteger
efdc57885a
memory tracing - store parents array instead of children table
2020-11-24 17:12:39 +01:00
Christopher Lackner
922ad16213
if more memory is deallocated than allocated set memtracer to 0 not negative values
2020-11-21 22:32:41 +01:00
Christopher Lackner
a69cdc9000
mem tracing compile time option, simplify by MemoryTracer as member
2020-11-21 15:49:07 +01:00
Matthias Hochsteger
87623981a6
export PajeTrace.WriteMemoryChart() to python
2020-11-19 19:29:04 +01:00
Matthias Hochsteger
f0152baacf
mem tracing - TraceMemorySwap helper function
2020-11-19 17:35:29 +01:00
Matthias Hochsteger
6f98123e98
mem tracing - use topological sorting, some fixes
2020-11-19 16:16:39 +01:00
Matthias Hochsteger
a17066a387
html chart for peak memory consumption, some Array tracing fixes
2020-11-19 14:57:45 +01:00
Matthias Hochsteger
f143995f27
clean up memory tracing
2020-11-18 21:45:00 +01:00
Matthias Hochsteger
1a93fb3fa5
first attempt on memory tracing
2020-11-18 20:20:35 +01:00
Joachim Schöberl
cddfb4a0b5
fixing delaunay2d point search, non-parallel for small meshes
2020-10-26 11:20:12 +01:00
Joachim Schöberl
bfbef51996
remove bitarray in delaunay2d, just one hashtable position
2020-10-23 19:40:47 +02:00
Matthias Hochsteger
832485e41a
pybind11 compatibility
2020-10-22 12:11:19 +02:00
Matthias Hochsteger
b3d757ccd1
update pybind11 to 2.6.0rc3
2020-10-17 17:58:38 +02:00
Matthias Hochsteger
6544fbeca6
sunburst chart - tooltip formatting
2020-10-14 11:52:34 +02:00
Matthias Hochsteger
14e6a1d24b
more statistics in sunburst chart
2020-10-14 11:52:26 +02:00
Matthias Hochsteger
25efdadd05
helper macro for Timer/RegionTimer definition
2020-10-13 11:11:33 +02:00
Joachim Schöberl
b5a9580a8e
BitArray::Data
2020-10-08 21:27:16 +02:00
Matthias Hochsteger
2629208f38
pajetrace - fix Timer names in MPI-trace
2020-10-08 12:20:46 +02:00
Matthias Hochsteger
7a1344bfcb
cmake variable NG_COMPILE_FLAGS to set additional compile options
2020-10-01 13:35:53 +02:00
Christopher Lackner
1666155d25
add range adaptors (filter, transform)
2020-09-19 17:39:03 +02:00
Joachim Schöberl
283db5c637
trange bracket with size_t for T_Range
2020-09-19 09:43:00 +02:00
Joachim Schöberl
8b5675a8e2
check if mpi is initialized
2020-09-15 23:16:04 +02:00
Christopher Lackner
c7af26771e
fix bug in BitArray==
2020-09-11 16:54:25 +02:00
Joachim Schöberl
65761e7768
check copy_assignable also in copy-constructor
2020-09-09 07:03:12 +02:00