BACHELOR PROJECTS (FACHPROJEKTE) WINTER SEMESTER 2024/2025

Scientific Computing

Content

This module teaches students how scientific computing workloads relate to the properties of modern hardware, and which strategies optimize such workloads for that hardware. On the one hand, it considers hardware properties and their potential for performance optimization; on the other hand, it examines scientific applications, which are assigned to subgroups as optimization projects. Both the applications and the hardware properties are first discussed in a seminar phase. Afterward, each subgroup is assigned a specific application and applies the knowledge gained to optimize its performance.

The hardware properties and scientific applications considered change over time and may be adjusted to the students' backgrounds and interests. Candidate topics include the following:
Hardware Properties:

Cache Prefetching: The large latency gap between cache hits and misses makes timely prefetching of data a crucial driver of performance. Software can control cache prefetching directly, for example through explicit prefetch instructions, or indirectly, through access patterns that the hardware prefetcher can predict.
Memory Locality: Caches are managed in blocks (cache lines), each of which covers multiple data words. A sequence of memory accesses is especially cache-friendly when accesses that are close in time target a small number of blocks. Software can control this through the layout of the application's data in memory.
SIMD Execution: Vector units, general-purpose GPUs, and special accelerators offer varying degrees of vector processing, which can achieve a high degree of data parallelism.
Branch Execution: Branches in a program's control flow can incur a pipeline-related branch penalty, and mispredicted branch outcomes worsen this effect. By ensuring data independence and making branches easier to predict, software can reduce this penalty.
Data Typing: Different data types consume different amounts of resources. This holds for data types of different bit widths as well as for floating-point versus fixed-point types. By choosing the most suitable data type for an application, software can control its resource consumption.
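
As a rough illustration of the memory-locality item above, the following C sketch stores the same data in two different layouts and sums a single field. All names (particle_aos, particles_soa, N_PARTICLES, total_mass_aos, total_mass_soa) are hypothetical and not taken from the module material, and the 64-byte cache block size mentioned in the comments is an assumption that holds on many, but not all, current CPUs.

```c
#include <stddef.h>

#define N_PARTICLES 1024 /* hypothetical problem size */

/* Array of structs (AoS): the four fields of one particle are adjacent. */
struct particle_aos {
    float x, y, z, mass;
};

/* Struct of arrays (SoA): all values of one field are adjacent. */
struct particles_soa {
    float x[N_PARTICLES];
    float y[N_PARTICLES];
    float z[N_PARTICLES];
    float mass[N_PARTICLES];
};

/* Reads only `mass`, but every loaded cache block also carries x, y, z.
 * Assuming 64-byte blocks and 4-byte floats, each block holds the mass
 * of only 4 particles. */
float total_mass_aos(const struct particle_aos *p, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += p[i].mass;
    }
    return sum;
}

/* Reads a dense array of masses. Under the same assumption, each block
 * holds 16 useful values, so fewer blocks are touched for the same work. */
float total_mass_soa(const struct particles_soa *p, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += p->mass[i];
    }
    return sum;
}
```

Both functions compute the same result; they differ only in how many cache blocks the loop has to load. Which layout is preferable depends on the access pattern: code that always uses all four fields of a particle together may be better served by the array-of-structs layout.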

Scientific Applications:

Tree-Based Data Structures: Tree-based structures are used for indexing and storage, especially in database systems. Although trees can have large memory footprints, they require only a fraction of that memory during execution. This allows for targeted optimization of their execution.
Matrix-Vector and Matrix-Matrix Multiplication: These operations process large amounts of data with only local dependencies. An implementation can therefore choose the order of its memory accesses almost arbitrarily, which leaves room for optimization.
Database Operators: Filtering, accumulation, and other typical database operators usually apply to large amounts of data that can be processed in parallel, which makes them good candidates for parallel computation.
Decision Trees / Random Forests: As a specific type of machine learning model, decision trees can be implemented close to the hardware, making use of the CPU's branching infrastructure. This allows the branches themselves to be optimized.
Quantization of Neural Networks: In neural networks, prediction quality is variable, which allows trade-offs between performance and prediction quality. Quantization forces a neural network to use smaller, more efficient data types, at the possible cost of reduced accuracy.
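
To make the matrix multiplication item above more concrete, the following C sketch reorders the memory accesses of a matrix-matrix product into tiles (loop blocking), one common way of exploiting the freedom in access order. It is a minimal sketch, not the method prescribed by the module: the function name matmul_tiled and the tile edge TILE are hypothetical, and a suitable tile size depends on the cache sizes of the target machine.

```c
#include <stddef.h>

#define TILE 32 /* hypothetical tile edge; tune to the target cache */

/* Computes C += A * B for square n x n matrices in row-major order.
 * The caller is expected to initialize C (for example, to zero). */
void matmul_tiled(const double *a, const double *b, double *c, size_t n)
{
    for (size_t ii = 0; ii < n; ii += TILE) {
        for (size_t kk = 0; kk < n; kk += TILE) {
            for (size_t jj = 0; jj < n; jj += TILE) {
                /* Clamp tile bounds so n need not be a multiple of TILE. */
                size_t i_end = ii + TILE < n ? ii + TILE : n;
                size_t k_end = kk + TILE < n ? kk + TILE : n;
                size_t j_end = jj + TILE < n ? jj + TILE : n;

                /* Work on one tile of A, B, and C at a time, so the data
                 * touched by the inner loops can stay in the cache. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        double a_ik = a[i * n + k];
                        /* The innermost loop walks rows of B and C
                         * contiguously. */
                        for (size_t j = jj; j < j_end; j++) {
                            c[i * n + j] += a_ik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

The tiled version performs the same multiplications as a plain triple loop, only in a different sequence, so any speedup comes from memory behavior alone. Whether and how much it helps has to be measured on the target hardware.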
