Commit Graph

318 Commits

Author SHA1 Message Date
Joachim Schoeberl
fd50131a5b constexpr function 2021-06-28 01:35:23 +02:00
Joachim Schoeberl
91506aa71a static constexpr 2021-06-28 01:07:03 +02:00
Joachim Schoeberl
31d5ce8be9 packed archiving 2021-06-27 12:32:51 +02:00
Joachim Schoeberl
54db7941d0 tuning mesh (un)pickling 2021-06-26 12:14:17 +02:00
Joachim Schoeberl
e84d4e90c8 add header for std::byte 2021-06-25 18:58:25 +02:00
Joachim Schoeberl
72fb819def missing overloads for archive (byte) 2021-06-24 13:21:43 +02:00
Joachim Schoeberl
971d6bb465 little tuning of mesh pickling 2021-06-24 07:39:09 +02:00
Matthias Hochsteger
4b40a7eb31 backward-compatible Timer interface 2021-06-14 14:55:08 +02:00
Matthias Hochsteger
1de1a1800e Fix template argument deduction for Timer, remove ThreadRegionTimer 2021-06-11 15:19:30 +02:00
Matthias Hochsteger
6f7543c7dc Timer - convenience constructors to disable tracing and/or timing
Examples:
Timer t0("name");
Timer t1("name", NoTracing);
Timer t2("name", NoTiming);
Timer t3("name", NoTracing, NoTiming);
Timer t4("name", NoTiming, NoTracing);
2021-06-11 10:08:06 +02:00
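A minimal sketch of how order-independent tag arguments such as NoTracing and NoTiming can be resolved at compile time. This is not the ngcore implementation: the names SketchTimerT, MakeSketchTimer, TNoTracing and TNoTiming are hypothetical, and a factory function stands in for the constructor syntax shown in the commit message. It assumes C++17.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical tag types. The tag object names mirror the commit message;
// their definitions here are assumptions made for illustration.
struct TNoTracing {};
struct TNoTiming {};
inline constexpr TNoTracing NoTracing{};
inline constexpr TNoTiming NoTiming{};

// Hypothetical timer whose behaviour is fixed by template arguments,
// so no run-time flag has to be checked when the timer is used.
template <bool Tracing, bool Timing>
class SketchTimerT {
  std::string name_;

public:
  static constexpr bool tracing = Tracing;
  static constexpr bool timing = Timing;

  explicit SketchTimerT(std::string name) : name_(std::move(name)) {}
  const std::string& Name() const { return name_; }
};

// Accepts the tags in any order. Each feature stays enabled unless
// its "No..." tag appears somewhere in the argument list.
template <typename... Tags>
auto MakeSketchTimer(std::string name, Tags...) {
  constexpr bool tracing = !(std::is_same_v<Tags, TNoTracing> || ...);
  constexpr bool timing = !(std::is_same_v<Tags, TNoTiming> || ...);
  return SketchTimerT<tracing, timing>(std::move(name));
}
```

With this sketch, MakeSketchTimer("name", NoTiming, NoTracing) and MakeSketchTimer("name", NoTracing, NoTiming) yield the same type, which corresponds to the t3 and t4 examples above.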
Matthias Hochsteger
c5639a5706 Thread-safe Timer
- use template arguments instead of run-time variable 'priority'
- change in paje interface for tracing
2021-06-11 09:52:58 +02:00
Matthias Hochsteger
a11294baf0 inline GetThreadI() (except on Windows, no dllexport for thread_local variables supported) 2021-06-11 09:51:23 +02:00
Matthias Hochsteger
a96a1e4624 separate memtracer.hpp 2021-06-10 09:38:00 +02:00
Joachim Schoeberl
abb2e43ccb optimize parallel load 2021-06-08 19:08:14 +02:00
Matthias Hochsteger
3ce5b1958e Initialize FlatArray members ( thx @mrambausek ) 2021-06-02 15:45:36 +02:00
Joachim Schoeberl
6dcc89ad04 some table py-features 2021-06-01 12:57:58 +02:00
Joachim Schoeberl
7c4f1cf53a minimal export of Table 2021-05-30 22:15:21 +02:00
Joachim Schoeberl
c3984fcc5b just use pointer for Array - iterators (on proposal of Matthias R) 2021-05-30 18:58:34 +02:00
Joachim Schoeberl
3258b27410 fix initialization order warning 2021-05-30 18:57:14 +02:00
Matthias Hochsteger
db494f4239 more Timers in Mesh 2021-05-12 10:56:21 +02:00
Matthias Hochsteger
4b53c63fba helper functions for table creation 2021-05-10 12:03:20 +02:00
Matthias Hochsteger
acf2b39680 Fix cross-platform archiving
This is a non-backward compatible change for archives on Windows!
2021-04-23 20:06:58 +02:00
Joachim Schoeberl
2d9e32ba70 ArrayMem from BaseArray ctor 2021-04-09 21:30:29 +02:00
Christopher Lackner
daa0985a41 trace memory free only when array owns memory 2021-04-07 09:58:53 +02:00
Christopher Lackner
4fad6e0631 fix pickling on arm, store long type platform independent 2021-04-01 10:48:13 +02:00
Joachim Schoeberl
1f45601387 Array<int> ia(n); ia.Range(2, END-1) 2021-03-29 22:39:57 +02:00
Christopher Lackner
001eaa32b6 DoArchive for LocalH 2021-03-29 13:55:23 +02:00
Stefan Brüns
85e8c09ff6 Fix GetTimeCounter for Aarch64 variants
Neither GCC nor Clang define an __arm64__ preprocessor macro, but use
__aarch64__ (MSVC uses _M_ARM64). Add a "64" suffix to the define, i.e.
NETGEN_ARCH_ARM64 to make it more obvious it only refers to aarch64, and
to be in line with NETGEN_ARCH_AMD64.

Replace the (Clang specific) __builtin_readcyclecounter with inline
asm:
- The function returns cycles (i.e. varies with CPU frequency), not time
- It may return 0, depending on the PMU settings
- It may cause an illegal instruction, in case it is not trapped by the
  kernel, e.g. on FreeBSD.

Reading the generic timer/counter CNTVCT_EL0 instead of PMCCNTR_EL0 avoids
these pitfalls. The inline asm works on GCC and Clang, instead of
Clang only for the builtin.
2021-03-03 17:30:33 +01:00
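A minimal sketch of the counter read described in this commit, assuming a GCC or Clang toolchain for the aarch64 branch. ReadTimeCounterSketch is a hypothetical name, not the library's GetTimeCounter, and the std::chrono fallback for other targets is an assumption added so the example builds everywhere.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper for illustration only.
inline std::uint64_t ReadTimeCounterSketch() {
#if defined(__aarch64__)
  // Read the virtual count register of the generic timer (CNTVCT_EL0).
  // It ticks at a fixed frequency, independent of the CPU clock,
  // unlike the PMU cycle counter PMCCNTR_EL0.
  std::uint64_t ticks;
  asm volatile("mrs %0, cntvct_el0" : "=r"(ticks));
  return ticks;
#else
  // Assumed fallback for targets without this register:
  // a monotonic clock from the standard library.
  return static_cast<std::uint64_t>(
      std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}
```

Two consecutive calls should return non-decreasing values on either branch. Converting ticks to seconds on aarch64 would additionally need the timer frequency, which this sketch does not read.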
Joachim Schoeberl
979a695f62 fixing warnings 2021-02-18 10:30:01 +01:00
Joachim Schoeberl
87e472b6fc start face-hierarchy in Netgen 2021-02-17 14:54:14 +01:00
Christopher Lackner
0c2430f3dc add std::any symboltable to Flags to store arbitrary objects 2021-02-08 15:44:15 +01:00
Joachim Schoeberl
25011c8407 arm-simd: HSum, tuple support 2021-02-05 11:59:03 +01:00
Joachim Schöberl
9a9828d3af some more arm-simds 2021-01-31 16:31:47 +01:00
Joachim Schöberl
18f5a933a9 arm-simd working 2021-01-30 21:02:49 +01:00
Joachim Schöberl
f53c069308 prepare SIMD for arm64 2021-01-30 20:05:28 +01:00
Joachim Schöberl
ea7f6c1e94 fnma intrinsic for avx512 2020-12-22 13:06:08 +01:00
Joachim Schöberl
c1c10174be FNMA asm-instruction 2020-12-22 09:37:21 +01:00
Matthias Hochsteger
94ecf8de92 Fix private linking of Python
The CMake export of Interface libraries also exports PRIVATE build
settings, which leads to build errors with non-existing include paths and .lib files for binary distributions.

Use the work-around mentioned here to circumvent this behavior:
https://gitlab.kitware.com/cmake/cmake/-/issues/15415#note_849405
2020-12-18 11:05:10 +01:00
Joachim Schöberl
d30accdc1a Merge branch 'apple_silicon' into 'master'
Support for Apple M1

See merge request jschoeberl/netgen!359
2020-12-16 20:47:21 +00:00
Matthias Hochsteger
eb6ac164e7 int64_t for masks 2020-12-16 21:00:12 +01:00
Matthias Hochsteger
d97a9a6594 Alignment for generic SIMD classes 2020-12-16 17:20:18 +01:00
Matthias Hochsteger
e68d8cea9b workaround for missing intrinsic on GCC 7 2020-12-16 10:58:01 +01:00
Matthias Hochsteger
9c0dbec8c9 Fix SIMD<mask64> ctor and Unpack 2020-12-15 15:31:17 +01:00
Matthias Hochsteger
dbe894fea3 Support for Apple M1 2020-12-15 10:43:11 +01:00
Matthias Hochsteger
1b55c51da5 remove AlignedAlloc, use alignas 2020-12-15 09:40:43 +01:00
Matthias Hochsteger
1f3aebcec0 Fix AVX-Operators for int64_t simd (use generic ones) 2020-12-15 09:40:43 +01:00
mhochsteger
f213a7a5b1 fix fabs for AVX on Windows 2020-12-14 15:50:27 +01:00
Matthias Hochsteger
248145bbf0 fix wrong simd operators 2020-12-14 12:47:53 +01:00
Matthias Hochsteger
fc44eb95df simd - array and variadic ctor 2020-12-11 23:12:34 +01:00
Matthias Hochsteger
2d667a08dc move (refactored) SIMD headers from ngsolve into ngcore 2020-12-11 20:54:41 +01:00