New Revision of Cork Computational Geometry Library – runs on Linux!

I have had enough free time lately to return to Cork and have made a couple of key improvements to the build:

  • Builds and runs on Linux!
  • Generally 10% faster!
  • Moved to CMake build system
  • Script available to build 3rd Party dependencies
  • Updated to C++20
  • Updated to most recent Boost, TBB and MPIR libraries
  • Started vectorization with AVX2 SIMD instruction set
  • A few improvements to the regression test app
  • Added a few more unit tests
  • Faster OFF file output

Combined, this makes for a much smoother ‘getting started’ experience. I will publish a Packer script that can be used to create an Ubuntu MATE 20.04 VM in VirtualBox or Proxmox for development.

The GitHub repository is at https://github.com/stephanfr/Cork.git and, at present, I am working in the v0.9.0 branch.

I plan to move forward and bring the 3rd party dependencies up to date and build out more unit tests while working on performance improvements. I believe there are a number of places in the code that will benefit from AVX2 vectorization.

Serial and SIMD implementation of the Xoshiro256+ random number generator – Part 1 Implementation and Usage

The Xoshiro256PlusSIMD project provides a C++ implementation of the Xoshiro256+ random number generator that matches the performance of the reference C implementation of David Blackman and Sebastiano Vigna (https://prng.di.unimi.it/). Xoshiro256+ combines high speed, small memory space requirements for stored state and excellent statistical quality. For cryptographic use cases, use a generator designed for cryptography; for use cases where absolutely the best statistical quality is required, maybe consider a different RNG like the Mersenne Twister. For any other conventional simulation or testing use case, Xoshiro256+ should be perfectly fine statistically and better than a whole lot of other slower alternatives.

This implementation is a header-only library and provides the following capabilities:

  • Single 64 bit unsigned random value
  • Single 64 bit unsigned random value reduced to a [lower, upper) range
  • Four 64 bit unsigned random values
  • Four 64 bit unsigned random values reduced to a [lower, upper) range
  • Single double length real random value in a range of (0,1)
  • Single double length real random value in a (lower, upper) range
  • Four double length real random values in a range of (0,1)
  • Four double length real random values in a (lower, upper) range

Implementation Details

For platforms supporting the AVX2 instruction set, the RNG can be configured to use AVX2 instructions or not on an instance by instance basis. AVX2 instructions are only used for the four-wide operations; there is no advantage to using them for single value generation.

The four-wide operations use a different random seed per value and the seed for single value generation is distinct as well. The same stream of values will be returned by the serial and AVX2 implementations. It might be faster for the serial implementation to use only a single seed across all four values, with each increasing index being the next value in a single series, instead of each of the four values having its own unique series. The downside of that approach is that the serial implementation would return different four-wide values than the AVX2 implementation. The AVX2 implementation must use distinct seeds for each of the four values.

The random series for each of the four-wide values are separated by 2^192 values – i.e. a Xoshiro256+ ‘long jump’ separates the seed for each of the four values. For clarity, the Xoshiro256+ has a state space of 2^256.
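As a rough illustration of the four independent series described above, the sketch below implements the published xoshiro256+ step function of Blackman and Vigna and advances four separate states side by side in plain serial code. The type names Xoshiro256PlusLane and FourWideSerial are hypothetical and are not taken from the library. Seeding is also left out: in the library, each lane's state would come from a long jump, which is not reproduced here.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

//  One independent xoshiro256+ stream (hypothetical type name for this sketch).
struct Xoshiro256PlusLane
{
    std::array<uint64_t, 4> s{};  //  256 bits of state, must not be all zero

    static constexpr uint64_t rotl(uint64_t x, int k) { return (x << k) | (x >> (64 - k)); }

    //  Published xoshiro256+ step: the output is s[0] + s[3], then the state is scrambled.
    uint64_t next()
    {
        //  An all-zero state would only ever produce zeros.
        assert((s[0] | s[1] | s[2] | s[3]) != 0);

        const uint64_t result = s[0] + s[3];
        const uint64_t t = s[1] << 17;

        s[2] ^= s[0];
        s[3] ^= s[1];
        s[1] ^= s[2];
        s[0] ^= s[3];
        s[2] ^= t;
        s[3] = rotl(s[3], 45);

        return result;
    }
};

//  Four lanes advanced together in serial code; lane i never touches the state of lane j.
struct FourWideSerial
{
    std::array<Xoshiro256PlusLane, 4> lanes{};

    std::array<uint64_t, 4> next4()
    {
        return {lanes[0].next(), lanes[1].next(), lanes[2].next(), lanes[3].next()};
    }
};
```

Because each lane carries its own state, an AVX2 version can keep the four s[0] words in one 256 bit register, the four s[1] words in another and so on, and should then produce the same four values per call as this serial form.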

The reduction of the uint64s to an integer range takes uint32 bounds. This is a significant reduction in the size of the random values but permits reduction while avoiding taking a modulus. If you have a need for random integer values beyond uint32 sizes, I’d suggest taking the full 64 bit values and applying your own reduction algorithm. The modulus approach to reduction is slower than the approach in the code which uses shifts and a multiply.
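To make the shift-and-multiply idea concrete, here is a minimal sketch of one common way to reduce a 64 bit random value into a [lower, upper) range with uint32 bounds. The function name reduce_to_range is hypothetical and this is the general multiply-then-shift technique, not necessarily the exact code in the library.

```cpp
#include <cassert>
#include <cstdint>

//  Hypothetical helper: maps a 64 bit random value into [lower, upper) without a modulus.
inline uint32_t reduce_to_range(uint64_t random_value, uint32_t lower, uint32_t upper)
{
    assert(lower < upper);

    const uint64_t range = static_cast<uint64_t>(upper) - lower;

    //  Keep the top 32 bits of the random value, treated as a fraction of 2^32.
    const uint64_t high_bits = random_value >> 32;

    //  (high_bits * range) fits in 64 bits because both factors are below 2^32.
    //  Shifting right by 32 scales the product back down into [0, range).
    return lower + static_cast<uint32_t>((high_bits * range) >> 32);
}
```

The 32 bit bounds are what keep the intermediate product inside a single 64 bit multiply, which is one plausible reason for limiting the bounds to uint32.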

Finally, the AVX versions are coded explicitly with AVX intrinsics; there is no reliance on the vagaries of compiler vectorization. The SIMD version could be written such that gcc should unroll loops and vectorize, but others have reported that it is necessary to tweak optimization flags to get the unrolling to work. For these implementations, all that is needed is to have the -mavx2 compiler option and the __AVX2_AVAILABLE__ symbol defined.

Usage

The class Xoshiro256Plus is a template class and takes a SIMDInstructionSet enumerated value as its only template parameter. SIMDInstructionSet may be ‘NONE’, ‘AVX’ or ‘AVX2’. The SIMD acceleration requires the AVX2 instruction set and uses ‘if constexpr’ to control code generation at compile time. There is also a preprocessor symbol __AVX2_AVAILABLE__ which must be defined to permit AVX2 instances of the RNG to be created. It is completely reasonable to have the AVX2 instruction set available but still use an RNG instance with no SIMD acceleration.

#define __AVX2_AVAILABLE__

#include "Xoshiro256Plus.h"

constexpr size_t NUM_SAMPLES = 1000;
constexpr uint64_t SEED = 1;

typedef SEFUtility::RNG::Xoshiro256Plus<SIMDInstructionSet::NONE> Xoshiro256PlusSerial;
typedef SEFUtility::RNG::Xoshiro256Plus<SIMDInstructionSet::AVX2> Xoshiro256PlusAVX2;

bool InsureFourWideRandomStreamsMatch()
{
    Xoshiro256PlusSerial serial_rng(SEED);
    Xoshiro256PlusAVX2 avx_rng(SEED);

    for (size_t i = 0; i < NUM_SAMPLES; i++)
    {
        auto next_four_serial = serial_rng.next4( 200, 300 );
        auto next_four_avx = avx_rng.next4( 200, 300 );

        if(( next_four_serial[0] != next_four_avx[0] ) ||
           ( next_four_serial[1] != next_four_avx[1] ) ||
           ( next_four_serial[2] != next_four_avx[2] ) ||
           ( next_four_serial[3] != next_four_avx[3] ))
        { return false; }
    }

    return true;
}
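The compile-time selection mentioned in the Usage text can be sketched in isolation. The snippet below is a simplified illustration of ‘if constexpr’ dispatch on an enumerated template parameter. The sketch namespace, the Generator class and the four_wide_path() function are all hypothetical, and the enum is re-declared locally only so that the example is self-contained.

```cpp
#include <string_view>

namespace sketch
{
    //  Local stand-in for the enumeration described in the text.
    enum class SIMDInstructionSet { NONE, AVX, AVX2 };

    template <SIMDInstructionSet simd>
    class Generator
    {
    public:
        //  Only the branch matching the template argument is instantiated, so a
        //  serial instance never contains any code from the AVX2 branch.
        static constexpr std::string_view four_wide_path()
        {
            if constexpr (simd == SIMDInstructionSet::AVX2)
            {
                return "avx2 intrinsics";
            }
            else
            {
                return "serial";
            }
        }
    };
}
```

Since the choice is part of the type, a serial instance and an AVX2 instance can live side by side in the same program, which is what the stream comparison above relies on.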

Struct Timespec Utilities

There are a number of different representations for time in C and C++. ‘struct timespec’ comes from POSIX and was added to the C standard in C11 to provide a representation of times that range beyond a simple integer. A ‘struct timespec’ contains two fields: tv_sec (a time_t) for seconds and tv_nsec (a long) for nanoseconds. Unlike the std::chrono classes, there are no literal operators or other supporting functions for timespec. C++17 has added std::timespec but that still lacks literals and other operators.

Examples

The utilities can be found in my GitHub repository:

https://github.com/stephanfr/TimespecUtilities

Including the ‘TimespecUtilities.hpp’ header file is all that is required. Literal suffix operators ‘_s’ for seconds and ‘_ms’ for milliseconds are defined as well as addition, subtraction and scalar multiplication operators. Examples follow:

const struct timespec five_seconds = 5_s;
const struct timespec one_and_one_half_seconds = 1.5_s;
const struct timespec five_hundred_milliseconds = 500_ms;

const struct timespec six_and_one_half_seconds = five_seconds + one_and_one_half_seconds;
const struct timespec three_and_one_half_seconds = five_seconds - one_and_one_half_seconds;

const struct timespec ten_seconds = five_seconds * 2;
const struct timespec fifty_milliseconds = five_hundred_milliseconds * 0.1;
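For readers curious how such operators can be put together, the following is a minimal sketch and not the contents of TimespecUtilities.hpp. The helper names make_timespec and to_nanoseconds are hypothetical. The sketch assumes non-negative durations that fit in a signed 64 bit count of nanoseconds, so subtracting a larger value from a smaller one is out of scope.

```cpp
#include <cassert>
#include <cmath>
#include <ctime>

constexpr long long NANOSECONDS_PER_SECOND = 1000000000LL;

//  Hypothetical helper: builds a normalized timespec from a non-negative nanosecond count.
inline timespec make_timespec(long long total_nanoseconds)
{
    assert(total_nanoseconds >= 0);

    timespec result{};
    result.tv_sec = static_cast<time_t>(total_nanoseconds / NANOSECONDS_PER_SECOND);
    result.tv_nsec = static_cast<long>(total_nanoseconds % NANOSECONDS_PER_SECOND);
    return result;
}

//  Hypothetical helper: flattens a timespec into nanoseconds.
inline long long to_nanoseconds(const timespec& value)
{
    return static_cast<long long>(value.tv_sec) * NANOSECONDS_PER_SECOND + value.tv_nsec;
}

//  Integer and floating point seconds, e.g. 5_s and 1.5_s
inline timespec operator""_s(unsigned long long seconds)
{
    return make_timespec(static_cast<long long>(seconds) * NANOSECONDS_PER_SECOND);
}

inline timespec operator""_s(long double seconds)
{
    return make_timespec(std::llround(seconds * 1.0e9L));
}

//  Integer milliseconds, e.g. 500_ms
inline timespec operator""_ms(unsigned long long milliseconds)
{
    return make_timespec(static_cast<long long>(milliseconds) * 1000000LL);
}

inline timespec operator+(const timespec& a, const timespec& b)
{
    return make_timespec(to_nanoseconds(a) + to_nanoseconds(b));
}

inline timespec operator-(const timespec& a, const timespec& b)
{
    return make_timespec(to_nanoseconds(a) - to_nanoseconds(b));
}

//  Scalar multiplication, rounded to the nearest nanosecond.
inline timespec operator*(const timespec& value, double scale)
{
    return make_timespec(std::llround(static_cast<double>(to_nanoseconds(value)) * scale));
}
```

Routing every operation through a single nanosecond count keeps tv_nsec inside [0, 1e9) without any special carry or borrow logic in the individual operators.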

HeapWatcher : Memory Leak Detector for Automated Testing


This project provides a simple tool for tracking heap allocations between start/finish points in C++ code. It is intended for use in unit tests and perhaps some feature tests. It is not a replacement for Valgrind or other memory debugging tools – the primary intent is to provide an easy-to-use tool that can be added to unit tests built with GoogleTest or Catch2 to find leaks and provide partial or full stack dumps of leaked allocations.

The project can be found on GitHub at: https://github.com/stephanfr/HeapWatcher

Design

The C standard library functions malloc(), calloc(), realloc() and free() are ‘weak symbols’ in glibc and can be replaced by user-supplied functions with the same signatures supplied in a user static library or shared object. This tool wraps the C standard library calls and then tracks all allocations and frees in a map. The ‘book-keeping’ is performed in a separate thread to (1) limit the need for mutexes or critical sections to protect shared state and (2) limit the run-time performance impact on the code under test. The functions in HeapWatcher are not intrusive in that they simply delegate to the glibc functions and then track allocations in a separate data structure. Allocation tracking can be paused in any thread being tracked and there is a facility to capture stack traces for ‘intentional leaks’ and then ignore those for tracking purposes.
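The book-keeping idea can be shown without replacing malloc() at all. The sketch below is a deliberately simplified, single-threaded model of tracking allocations and frees in a map. The AllocationLedger class and its method names are hypothetical, and the real tool's tracker thread, queue and stack traces are omitted.

```cpp
#include <cstddef>
#include <map>

namespace sketch
{
    //  Hypothetical ledger: every allocation adds an entry, every matching free removes it.
    //  Whatever is left when watching stops is a candidate leak.
    class AllocationLedger
    {
    public:
        void record_allocation(const void* address, std::size_t size)
        {
            open_allocations_[address] = size;
            ++number_of_mallocs_;
        }

        void record_free(const void* address)
        {
            //  Frees of addresses that were never recorded are simply ignored.
            if (open_allocations_.erase(address) > 0)
            {
                ++number_of_frees_;
            }
        }

        std::size_t open_allocation_count() const { return open_allocations_.size(); }
        std::size_t number_of_mallocs() const { return number_of_mallocs_; }
        std::size_t number_of_frees() const { return number_of_frees_; }

        std::size_t bytes_outstanding() const
        {
            std::size_t total = 0;
            for (const auto& entry : open_allocations_)
            {
                total += entry.second;
            }
            return total;
        }

    private:
        std::map<const void*, std::size_t> open_allocations_;
        std::size_t number_of_mallocs_ = 0;
        std::size_t number_of_frees_ = 0;
    };
}
```

In a wrapper-based design, the replacement malloc() would call the real allocator first and then hand the returned address and size to something like record_allocation(), and the replacement free() would do the reverse.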

There exists a single global static instance of HeapWatcher which can be accessed with the SEFUtility::HeapWatcher::get_heap_watcher() function.

Additionally, there are a pair of multi-threaded test fixtures provided in the project. One fixture launches workload threads and requires the user to manage the heap watcher. The second test fixture integrates the heap watcher and tracks all allocations made while the instance of the fixture itself is in scope.

For memory intensive applications running on many cores, the single tracker thread may be insufficient. All allocation records go into a queue, will not be lost and will eventually be processed. Potential problems can arise if the application allocates faster than the single thread can keep up and the queue used for passing the records to the tracker thread grows to the point that it exhausts system memory. When the HeapWatcher stops, the memory snapshot it returns is the result of processing all allocation records – so it should be correct.

Including into a Project

Probably the easiest way to use HeapWatcher is to include it through the fetch mechanism provided by CMake:

include(FetchContent)

FetchContent_Declare(
    heapwatcher
    GIT_REPOSITORY "https://github.com/stephanfr/HeapWatcher.git" )

FetchContent_MakeAvailable(heapwatcher)

include_directories(
    ${heapwatcher_SOURCE_DIR}/include
    ${heapwatcher_BINARY_DIR}
)

The CMake specification for HeapWatcher will build the library, which must be linked into your project. In addition, for the call stack decoding to work properly, the following linker option must be included in your project as well:

SET(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -rdynamic")


HeapWatcher is not a header-only project; the linker must have concrete instances of malloc(), calloc(), realloc() and free() to link to the rest of the code under test. Given the ease of including the library with CMake, this doesn’t present much of a problem overall.

Using HeapWatcher


Only a single header file HeapWatcher.hpp must be included in any file wishing to use the tool. This header contains all the data structures and classes needed to use the tool. The HeapWatcher class itself is fairly simple and the call to retrieve the global instance is trivial :

namespace SEFUtility::HeapWatcher
{
    class HeapWatcher
    {
        public:
            virtual void start_watching() = 0;
            virtual HeapSnapshot stop_watching() = 0;

            [[nodiscard]] virtual PauseThreadWatchGuard pause_watching_this_thread() = 0;
            
            virtual uint64_t capture_known_leak(std::list<std::string>& leaking_symbols, std::function<void()> function_which_leaks) = 0;
            [[nodiscard]] virtual const KnownLeaks known_leaks() const = 0;

            [[nodiscard]] virtual const HeapSnapshot snapshot() = 0;
            [[nodiscard]] virtual const HighLevelStatistics high_level_stats() = 0;
    };

    HeapWatcher& get_heap_watcher();
}


Note the namespace declaration. There are a number of other classes declared in the HeapWatcher.hpp header for the HeapSnapshot and to provide the pause watching capability. A simple example of using HeapWatcher in a Catch2 test appears below:

void OneLeak() { int* new_int = static_cast<int*>(malloc(sizeof(int))); }

void OneLeakNested() { OneLeak(); }
   
TEST_CASE("Basic HeapWatcher Tests", "[basic]")
{
    SECTION("One Leak Nested", "[basic]")
    {
        SEFUtility::HeapWatcher::get_heap_watcher().start_watching();

        OneLeakNested();

        auto leaks(SEFUtility::HeapWatcher::get_heap_watcher().stop_watching());

        REQUIRE(leaks.open_allocations().size() == 1);

        REQUIRE_THAT(leaks.open_allocations()[0].stack_trace()[0].function(), Catch::Matchers::Equals("OneLeak()"));
        REQUIRE_THAT(leaks.open_allocations()[0].stack_trace()[1].function(),
                    Catch::Matchers::Equals("OneLeakNested()"));

        REQUIRE(leaks.high_level_statistics().number_of_mallocs() == 1);
        REQUIRE(leaks.high_level_statistics().number_of_frees() == 0);
        REQUIRE(leaks.high_level_statistics().number_of_reallocs() == 0);
        REQUIRE(leaks.high_level_statistics().bytes_allocated() == sizeof(int));
        REQUIRE(leaks.high_level_statistics().bytes_freed() == 0);
    }
}

Capturing Known Leaks

In various third party libraries there exist intentional leaks. A good example is the leak of a pointer for thread local storage for each thread created by the pthread library. There is a leak from the symbol ‘_dl_allocate_tls’ that appears to remain even after std::thread::join() is called. This appears not infrequently in Valgrind reports as well. Given the desire to make this a library for automated testing, there is the capability to capture and then ignore allocations from certain functions or methods. An example appears below:

SECTION("Known Leak", "[basic]")
{
    std::list<std::string> leaking_symbol({"KnownLeak()"});

    REQUIRE( SEFUtility::HeapWatcher::get_heap_watcher().capture_known_leak(leaking_symbol, []() { KnownLeak(); }) == 1 );

    REQUIRE(SEFUtility::HeapWatcher::get_heap_watcher().known_leaks().addresses().size() == 2);
    REQUIRE_THAT(SEFUtility::HeapWatcher::get_heap_watcher().known_leaks().symbols()[0].function(),
                 Catch::Matchers::Equals("_dl_allocate_tls"));
    REQUIRE_THAT(SEFUtility::HeapWatcher::get_heap_watcher().known_leaks().symbols()[1].function(),
                 Catch::Matchers::Equals("KnownLeak()"));

    SEFUtility::HeapWatcher::get_heap_watcher().start_watching();

    OneLeakNested();
    KnownLeak();
    OneLeak();

    auto leaks(SEFUtility::HeapWatcher::get_heap_watcher().stop_watching());

    REQUIRE(leaks.open_allocations().size() == 2);
}

The capture_known_leak() method takes two arguments: 1) a std::list<std::string> containing one or more symbols which, if located in a stack trace, will cause the allocation associated with the trace to be ignored and 2) a function (or lambda) which will trigger one or more leaks associated with the symbols passed in the first argument. The leaking function need not be directly adjacent to the malloc; it may be further up the call stack, but the allocation will only be ignored if the symbol appears at the same number of frames above the memory allocation as at the time the leak was captured.
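The ‘same number of frames above the allocation’ rule can be illustrated with a small matching function. This is only a sketch of the idea: the KnownLeakPattern struct and the matches_known_leak() function are hypothetical, and stack traces are reduced to plain lists of function names with index 0 being the frame closest to the allocation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch
{
    //  Hypothetical record of a captured leak: which symbol was seen and how many
    //  frames above the allocation it appeared when the leak was captured.
    struct KnownLeakPattern
    {
        std::string symbol;
        std::size_t frames_above_allocation;
    };

    //  An allocation is treated as a known leak only if the symbol sits at exactly
    //  the captured depth in its stack trace.
    inline bool matches_known_leak(const std::vector<std::string>& stack_trace,
                                   const KnownLeakPattern& pattern)
    {
        return pattern.frames_above_allocation < stack_trace.size() &&
               stack_trace[pattern.frames_above_allocation] == pattern.symbol;
    }
}
```

Matching on symbol plus depth, rather than on raw return addresses, is what lets the comparison survive address randomization between runs.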

This approach of actively capturing the leak at runtime is effective for dealing with ASLR (Address Space Layout Randomization) and does not require loading of shared libraries or other linking or loading gymnastics.

Pausing Allocation Tracking


The PauseThreadWatchGuard instance returned by a call to HeapWatcher::pause_watching_this_thread() is a scope based mechanism for suspending heap activity tracking in a thread. For example, the above snippet can be modified to not log the leak in OneLeakNested() by obtaining a guard and putting the leaking call into the same scope as the guard:

    SEFUtility::HeapWatcher::get_heap_watcher().start_watching();

    {
      auto pause_watching = SEFUtility::HeapWatcher::get_heap_watcher().pause_watching_this_thread();

      OneLeakNested();
    }

    auto leaks(SEFUtility::HeapWatcher::get_heap_watcher().stop_watching());

    REQUIRE(leaks.open_allocations().size() == 0);

Once the guard instance goes out of scope, HeapWatcher will again start tracking allocations in the thread.
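A scope based guard of this kind is usually a small RAII class. The sketch below shows one way such a guard could work, using a thread-local counter so that pausing one thread does not affect any other. The names pause_depth, tracking_this_thread() and PauseGuard are hypothetical and are not HeapWatcher's internals.

```cpp
namespace sketch
{
    //  Per-thread pause depth; zero means allocations on this thread are tracked.
    inline thread_local int pause_depth = 0;

    inline bool tracking_this_thread() { return pause_depth == 0; }

    //  Hypothetical guard: pauses tracking on construction, resumes on destruction.
    class [[nodiscard]] PauseGuard
    {
    public:
        PauseGuard() { ++pause_depth; }
        ~PauseGuard() { --pause_depth; }

        //  Copying a guard would unbalance the counter, so it is disallowed.
        PauseGuard(const PauseGuard&) = delete;
        PauseGuard& operator=(const PauseGuard&) = delete;
    };
}
```

Using a counter rather than a boolean lets guards nest safely: tracking only resumes when the outermost guard goes out of scope.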

Test Fixtures

Two test fixtures are included with HeapWatcher and both are intended to ease the creation of multi-threaded unit test cases, which are useful for detecting race conditions or deadlocks. The test fixtures feature the ability to add functions or lambdas for ‘workload functions’ and then start all of those ‘workload functions’ simultaneously. Alternatively, ‘workload functions’ may be given a random start delay in seconds (as a double so it may be fractions of a second as well). This permits stress testing with a lot of load started at one time or allows for load to ramp over time.

The SEFUtility::HeapWatcher::ScopedMultithreadedTestFixture class starts watching the heap on creation and takes a function or lambda which will be called with a HeapSnapshot when all threads have completed, to permit testing the final heap state. This test fixture effectively hides the HeapWatcher instructions whereas the SEFUtility::HeapWatcher::MultithreadedTestFixture class requires the user to wrap the test fixture with the HeapWatcher start and stop.

Examples of both test fixtures appear below. First is an example of MultithreadedTestFixture :

    SECTION("Torture Test, One Leak", "[basic]")
    {
        constexpr int64_t num_operations = 2000000;
        constexpr int NUM_WORKERS = 20;

        SEFUtility::HeapWatcher::MultithreadedTestFixture test_fixture;

        SEFUtility::HeapWatcher::get_heap_watcher().start_watching();

        test_fixture.add_workload(NUM_WORKERS,
                                  std::bind(&RandomHeapOperations, num_operations));  //  NOLINT(modernize-avoid-bind)
        test_fixture.add_workload(1, &OneLeak);

        std::this_thread::sleep_for(10s);

        test_fixture.start_workload();
        test_fixture.wait_for_completion();

        auto leaks = SEFUtility::HeapWatcher::get_heap_watcher().stop_watching();

        REQUIRE(leaks.open_allocations().size() == 1);
    }

An example of ScopedMultithreadedTestFixture follows :

    SECTION("Two Workloads, Few Threads, one Leak", "[basic]")
    {
        constexpr int NUM_WORKERS = 5;

        SEFUtility::HeapWatcher::ScopedMultithreadedTestFixture test_fixture(
            [](const SEFUtility::HeapWatcher::HeapSnapshot& snapshot) { REQUIRE(snapshot.numberof_leaks() == 5); });

        test_fixture.add_workload(NUM_WORKERS, &BuildBigMap);
        test_fixture.add_workload(NUM_WORKERS, &OneLeak);

        std::this_thread::sleep_for(1s);

        test_fixture.start_workload();
    }

Conclusion

HeapWatcher and the multithreaded test fixture classes are intended to help developers create tests which check for memory leaks either in simple procedural test cases written with GoogleTest or Catch2 or in more complex multi-threaded tests in those same base frameworks.

https://github.com/stephanfr/HeapWatcher

Managing Privileges for Automated Raspberry Pi GPIO Testing

Many RPi libraries manipulate the GPIO pins by mapping various GPIO control memory blocks into the process address space. For GPIO input/output pins only, the Raspberry Pi OS kernel supports the /dev/gpiomem device which can be accessed from user space. For any other GPIO functions, such as setting up a PWM output, other memory blocks must be accessed through /dev/mem.

Typically, the base address of the desired control block is mapped into the process address space using the mmap system call, which takes a file descriptor for the /dev/mem device. Unprivileged user space processes cannot open the /dev/mem device, so a common workaround is to run the process as root using sudo. Additionally, GPIO ISR handling typically has much higher fidelity when the ISR dispatching thread runs under one of the ‘real-time’ scheduling policies. This too requires elevated privileges.

For many use cases, like automated CI/CD, running a test process under sudo is a less than optimal approach and certainly violates the ‘Principle of Least Privilege’. Typically, these kinds of impediments result in skipping automated testing or using workarounds like putting root passwords in response files.

Bluntly, there is no way to provide elevated privileges to a process without incurring some security risk and for clarity the approach described below is not strictly *secure* – I feel it is better and more constrained than most of the alternatives I have found. Certainly for hobbyists and individuals working on a benchtop, this is probably more than ‘good enough’.

Background

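The RPi maps the GPIO controls into physical memory at 0x3F200000 on the BCM2836/BCM2837 (Models 2 & 3) and at 0xFE200000 on the BCM2711 (Model 4), that is, a peripheral base of 0x3F000000 or 0xFE000000 plus a GPIO offset of 0x200000. The 0x7E200000 address that appears in the Broadcom datasheets is the bus address of the same block, not the physical address seen by the ARM cores. To map the block into a process, a snippet of code like the following is used: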

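#include <fcntl.h>     //  open(), O_RDWR, O_SYNC
#include <sys/mman.h>  //  mmap(), PROT_*, MAP_*
#include <cstdint>

constexpr int MAPPED_MEMORY_PROTECTION = PROT_READ | PROT_WRITE | PROT_EXEC;
constexpr int MAPPED_MEMORY_FLAGS = MAP_SHARED | MAP_LOCKED;

//  Peripheral base for the BCM2711 (RPi 4); the GPIO block sits 0x200000 above it
uint32_t peripheral_base = 0xFE000000;
uint32_t gpio_offset = 0x200000;
uint32_t gpio_block_size = 0xF4;

//  open() returns -1 without sufficient privileges; mmap() returns MAP_FAILED on error
int dev_mem_fd = open("/dev/mem", O_RDWR | O_SYNC );
void*  gpios = mmap( nullptr, gpio_block_size, MAPPED_MEMORY_PROTECTION, MAPPED_MEMORY_FLAGS, dev_mem_fd, peripheral_base + gpio_offset );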
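Since the address passed to mmap is just a peripheral base plus a fixed GPIO offset, the per-model arithmetic can be captured in a couple of constexpr functions. The sketch below uses the 0x3F000000 and 0xFE000000 peripheral bases and the 0x200000 offset discussed here. The BoardFamily enum and the function names are hypothetical.

```cpp
#include <cstdint>

namespace sketch
{
    //  Hypothetical grouping of boards by where their peripherals are mapped.
    enum class BoardFamily { RPI_2_OR_3, RPI_4 };

    //  Offset of the GPIO control block from the peripheral base.
    constexpr uint32_t GPIO_OFFSET = 0x200000;

    constexpr uint32_t peripheral_base(BoardFamily family)
    {
        return family == BoardFamily::RPI_4 ? 0xFE000000u : 0x3F000000u;
    }

    //  Physical address to hand to mmap() as the offset into /dev/mem.
    constexpr uint32_t gpio_physical_address(BoardFamily family)
    {
        return peripheral_base(family) + GPIO_OFFSET;
    }
}
```

Other control blocks, such as the one for PWM, would follow the same pattern with a different offset from the same peripheral base.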

The /dev/mem device cannot be opened if the process does not have the CAP_SYS_RAWIO capability. There are a lot of other operations that are permitted for processes with that capability – but an ability to map physical memory into a virtual address space opens up a whole plethora of potential compromises.

Unfortunately, without this privilege any app needing to open /dev/mem will have to be run with sudo – which is difficult to manage in an automated pipeline or even when running unit tests within an IDE, like using Catch2 in VSCode.

Workaround

Linux permits capabilities to be assigned to files, so it is possible to provide the CAP_SYS_RAWIO capability to specific files – for instance the unit test app created by a makefile. To do this, the following will suffice:

sudo setcap cap_sys_rawio+eip tests

However, every time the tests file is rebuilt, the capabilities must be re-assigned – so we have not really made much progress, there is still a need for the user to intervene and provide a root password after every build.

To work around this, the interactive user could grant herself the CAP_SETFCAP capability and then the command above can be run without requiring sudo. Giving a process the CAP_SETFCAP capability is just one small step away from simply running as root, so we should strive for something better.

It is possible to permit a user or group to execute commands with sudo but without requiring a password by adding entries to a sudoers file. In fact, this capability can be fairly tightly constrained to very specific command patterns. These files can be placed under /etc/sudoers.d/ and will be picked up by the sudo processor. An example appears below :

steve ALL=(root) NOPASSWD:  /sbin/setcap cap_sys_rawio+eip /home/sef/dev/unit_tests/tests
steve ALL=(root) NOPASSWD: !/sbin/setcap cap_sys_rawio+eip /home/sef/dev/unit_tests/tests*..*
steve ALL=(root) NOPASSWD: !/sbin/setcap cap_sys_rawio+eip /home/sef/dev/unit_tests/tests*[ ]*

In the example, the first line permits user steve to run /sbin/setcap cap_sys_rawio+eip /home/sef/dev/unit_tests/tests without having to supply a password. The next two lines have an exclamation point to negate the operation and effectively eliminate any permutations of the prior line which could be used to grant the capability to a different, unintended file.

This combination puts us in a place where any process running under steve can provide the CAP_SYS_RAWIO capability to only the /home/sef/dev/unit_tests/tests file without having to supply the root password. Clearly, if steve’s account is compromised it would be possible for someone to gain root – but the attacker would have to do a lot more work to get there than if privileges had been provided indiscriminately or if the root password were placed in a response file.

Doing the above gets us close, but there is one more step needed. The /dev/mem file is owned by root and can only be accessed by root. Assigning capabilities granularly elevates privileges in the interactive process but that process is still not root. To resolve this final stumbling block, we can modify the ACL for /dev/mem to permit the interactive user to access it. An example of how to do this appears below :

sudo setfacl -m u:steve:rw /dev/mem

This command will not persist through reboots, but needs to be executed only once after a reboot. It would be possible to make this assignment persistent if desired.
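Because the ACL does not survive a reboot, a test binary may want to check up front whether /dev/mem is currently accessible and report a clear message if it is not. A minimal sketch using the POSIX access() call follows. The function name has_read_write_access is hypothetical. Note that access() only checks file permissions for the real user, so passing this check does not prove that the CAP_SYS_RAWIO capability is also in place.

```cpp
#include <unistd.h>

namespace sketch
{
    //  Hypothetical pre-flight check: true if the calling user may read and write 'path'.
    inline bool has_read_write_access(const char* path)
    {
        return access(path, R_OK | W_OK) == 0;
    }
}
```

A test set-up routine could call sketch::has_read_write_access("/dev/mem") and, on failure, remind the user to apply the ACL before the tests attempt to map the GPIO block.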

Putting It All Together

The good news is that all of the above can be *mostly* automated as part of a CMake specification. Only two infrequent manual steps are required.

The following example uses a pair of template files and some CMake specifications to create a specialized sudoers file and a specialized shell script for setting the ACLs properly for /dev/mem. Personally, I put the templates in the misc subdirectory of my unit tests folder and the CMakeLists.txt file is in the unit test folder itself. For the purposes of this example, the templates must simply be in the subdirectory misc of the directory holding the CMakeLists.txt file.

#
#   Allow setcap execute without a password only for the CAP_SYS_RAWIO capability on
#       the tests file.  The negative patterns are intended to reduce the risk of anything 
#       other than just 'tests' being modified
#
#   Copy the generated file with the variables replaced into the /etc/sudoers.d directory
#

$ENV{USER} ALL=(root) NOPASSWD:  /sbin/setcap cap_sys_rawio+eip ${CMAKE_CURRENT_BINARY_DIR}/tests
$ENV{USER} ALL=(root) NOPASSWD: !/sbin/setcap cap_sys_rawio+eip ${CMAKE_CURRENT_BINARY_DIR}/tests*..*
$ENV{USER} ALL=(root) NOPASSWD: !/sbin/setcap cap_sys_rawio+eip ${CMAKE_CURRENT_BINARY_DIR}/tests*[ ]*

Adding the following to the CMakeLists.txt file will generate the final sudoers file. The CMake file command copies the generated file back into the source directory next to the template whilst also setting the file permissions appropriately. After the file is generated, it must be manually copied with the right file permissions to the /etc/sudoers.d/ directory, as that operation requires root privilege.

configure_file( ./misc/020_setcap_rawio_on_test_app.in ${CMAKE_CURRENT_BINARY_DIR}/misc/020_setcap_rawio_on_test_app )
file( COPY ${CMAKE_CURRENT_BINARY_DIR}/misc/020_setcap_rawio_on_test_app
      DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/misc
      FILE_PERMISSIONS OWNER_READ GROUP_READ WORLD_READ )

Finally, adding the following to the CMakeLists.txt file will assign the CAP_SYS_RAWIO capability to the tests file every time it is generated.

add_custom_command(TARGET tests POST_BUILD
                   COMMAND sudo setcap cap_sys_rawio+eip ${CMAKE_CURRENT_BINARY_DIR}/tests)

To make the ACL assignment easier, a similar process is used. First, a template file which will be processed by CMake is needed :

#!/bin/bash
sudo setfacl -m u:$ENV{USER}:rw /dev/mem

Then, the right magic in CMakeLists.txt to process the template file :

configure_file( ./misc/set_devmem_acl.in ${CMAKE_CURRENT_BINARY_DIR}/misc/set_devmem_acl.sh )
file( COPY ${CMAKE_CURRENT_BINARY_DIR}/misc/set_devmem_acl.sh
      DESTINATION ${CMAKE_CURRENT_SOURCE_DIR}/misc
      FILE_PERMISSIONS OWNER_READ OWNER_WRITE OWNER_EXECUTE GROUP_READ GROUP_EXECUTE WORLD_READ WORLD_EXECUTE )

This will create a shell script with the proper substitutions for the interactive user. This script needs to be executed once per session which seems a reasonable compromise. Alternatively, the sudoers file could be enriched to permit the command to be executed without a password and then even the process of permitting the interactive user access to /dev/mem can be used in automated scripts.

Adding CAP_SYS_NICE

As mentioned in the introduction, ISRs servicing GPIO interrupts will typically need to run with realtime scheduling for reasonable performance. The main risk that concerns me is *missing interrupts* and beyond a couple kilohertz on an RPi4 it is easy to lose interrupts. Realtime scheduling in Linux can be applied at the thread level using pthread_setschedparam in something like the following:

if (realtime_scheduling_)
{
    struct sched_param thread_scheduling;

    thread_scheduling.sched_priority = 5;

    int result = pthread_setschedparam(pthread_self(), SCHED_FIFO, &thread_scheduling);

    if( result != 0 )
    {
        SPDLOG_ERROR( "Unable to set ISR Thread scheduling policy - file may need cap_sys_nice capability.  Error: {}", result );
    }
}

Using realtime scheduling is something of a risk, as poorly designed code can starve the rest of the system or, perhaps more frequently, a user looking for more responsiveness can ‘nice’ their processes to the detriment of other processes. For that reason, Linux requires the CAP_SYS_NICE capability to execute the above snippet.
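When a test may run both with and without CAP_SYS_NICE, it can help to isolate the scheduling request in a function that clamps the priority to the valid range and reports a permission failure separately from other errors. The sketch below does that with standard POSIX calls. The function names request_fifo_scheduling and is_permission_problem are hypothetical.

```cpp
#include <algorithm>
#include <cerrno>
#include <pthread.h>
#include <sched.h>

namespace sketch
{
    //  Requests SCHED_FIFO for the calling thread, clamping the priority to the range
    //  the system allows. Returns 0 on success or an errno-style error code.
    inline int request_fifo_scheduling(int requested_priority)
    {
        const int lowest = sched_get_priority_min(SCHED_FIFO);
        const int highest = sched_get_priority_max(SCHED_FIFO);

        sched_param scheduling{};
        scheduling.sched_priority = std::clamp(requested_priority, lowest, highest);

        return pthread_setschedparam(pthread_self(), SCHED_FIFO, &scheduling);
    }

    //  EPERM is the usual result when the process lacks the needed privilege.
    inline bool is_permission_problem(int result) { return result == EPERM; }
}
```

A caller could fall back to the default scheduling policy and log a hint about cap_sys_nice when is_permission_problem() returns true, rather than treating the failure as fatal.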

The templates above can be enriched to include CAP_SYS_NICE, but there are a few details that *really matter*. The nastiest little complication is the difference in how the comma (i.e. ‘,’) is used by the setcap command and how the comma is interpreted in a sudoers file. In both cases it is a separator, to separate multiple capabilities for setcap and to separate different commands in the sudoers file. Therefore, within the setcap command in the sudoers file, the comma must be escaped with a backslash. The following is the template from above including the CAP_SYS_NICE capability.

#
#   Allow setcap execute without a password only for the CAP_SYS_RAWIO and CAP_SYS_NICE capabilities
#       on the tests file.  The negative patterns are intended to reduce the risk of anything
#       other than just 'tests' being modified
#
#   Copy the generated file with the variables replaced into the /etc/sudoers.d directory
#

$ENV{USER} ALL=(root) NOPASSWD:  /sbin/setcap cap_sys_rawio\,cap_sys_nice+eip ${CMAKE_CURRENT_BINARY_DIR}/tests
$ENV{USER} ALL=(root) NOPASSWD: !/sbin/setcap cap_sys_rawio\,cap_sys_nice+eip ${CMAKE_CURRENT_BINARY_DIR}/tests*..*
$ENV{USER} ALL=(root) NOPASSWD: !/sbin/setcap cap_sys_rawio\,cap_sys_nice+eip ${CMAKE_CURRENT_BINARY_DIR}/tests*[ ]*

As shown above, the comma separating capabilities is escaped with a backslash. Similarly, the setcap command used to assign capabilities to the tests file will have to be modified in the CMakeLists.txt specification.

Conclusion

Hopefully the content in this post will help not only manage permissions necessary for developing GPIO applications on the RPi but also provide some insight into how CMake can be used to generate various kinds of files from templates for specific use cases. I use these in my CMake files in VSCode while developing remotely on RPi 3s and 4s and it is certainly a lot more fluid a development experience than having to enter the root password all the time.

Cork – A High Performance Library for Geometric Boolean/CSG Operations

Gilbert Bernstein is currently a Ph.D. student at Stanford and had published some remarkable papers on computational geometry.  I was first drawn to his work by his 2009 paper on Fast, Exact, Linear Booleans as my interest in 3D printing led me to create some tooling of my own.  The various libraries I found online for performing Constructive Solid Geometry (CSG) operations were certainly good but overall, very slow.  CGAL is one library I had worked with and I found that the time required for operations on even moderately complex meshes was quite long.  CGAL’s numeric precision and stability is impeccable, the 3D CSG operations in CGAL are based on 3D Nef Polyhedra but I found myself waiting quite a while for results.

I exchanged a couple emails with Gilbert and he pointed me to a new library he had published, Cork.  One challenge with the models he used in his Fast, Exact paper is that the internal representation of 3D meshes was not all that compatible with other toolsets.  Though the boolean operations were fast, using those algorithms imposed a conversion overhead and eliminates the ability to take other algorithms developed on standard 3D mesh representations and use them directly on the internal data structures.  Cork is fast but uses a ‘standard’ internal representation of 3D triangulated meshes, a win-win proposition.

I’ve always been one to tinker with code, so I forked Gilbert’s code to play with it.  I spent a fair amount of time working with the code and I don’t believe I found any defects but I did find a few ways to tune it and bump up the performance.  I also took a swag at parallelizing sections of the code to further reduce the wall clock time required for an operation, though with limited success.  I believe the main problem I ran into is related to cache invalidation within the x86 CPU.  I managed to split several of the most computationally intensive sections into multiple threads of execution – but the performance almost always dropped as a result.  I am not completely finished working on threading the library and I may write a post later on what I believe I have seen and how to approach parallelizing algorithms like Cork on current generation CPUs.

Building the Library

My fork of Cork can be found here: https://github.com/stephanfr/Cork.  At present, it only builds on MS Windows with MS Visual Studio, I use VS 2013 Community Edition.  There are dependencies on the Boost C++ libraries, the MPIR library, and Intel’s Threading Building Blocks library.  There are multiple build targets, both for Win32 and x64 as well as for Float, Double and SSE arithmetic based builds.  In general, the Win32 Float-SSE builds will be the quickest but will occasionally fail due to numeric over or underflow.  The Double x64 builds are 10 to 20% slower but seem solid as a rock numerically.  An ‘Environment.props’ file exists at the top level of the project and contains a set of macros pointing to dependencies.

The library builds as a DLL.  The external interface is straightforward, the only header to include is ‘cork.h’, it will include a few other files.  In a later post I will discuss using the library in a bit more detail but a good example of how to use it may be found in the ‘RegressionTest.cpp’ file in the ‘RegressionTest’ project.  At present, the library can only read ‘OFF’ file formats.

There is no reason why the library should not build on Linux platforms, the dependencies are all cross platform and the code itself is pretty vanilla C++.  There may be some issues compiling the SSE code in gcc, but the non-SSE builds should be syntactically fine.

Sample Meshes

I have a collection of sample OFF file meshes in the Github repository:  https://github.com/stephanfr/SolidModelRepository.  The regression test program loads a directory and performs all four boolean operations between each pair of meshes in the directory – writing the result mesh to a separate directory.
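The shape of that regression loop can be sketched as follows.  This is not the code in RegressionTest.cpp: the operation names, the RegressionJob struct and the choice to visit each unordered pair once are all my assumptions for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Operation names are assumed for illustration only.
enum class BooleanOp { Union, Difference, Intersection, SymmetricDifference };

// Hypothetical record describing one operation to run on one pair of meshes.
struct RegressionJob
{
    std::string first;
    std::string second;
    BooleanOp op;
};

// Builds one job per operation for every unordered pair of mesh files.
std::vector<RegressionJob> build_jobs( const std::vector<std::string>& meshFiles )
{
    static const BooleanOp allOps[] = { BooleanOp::Union,
                                        BooleanOp::Difference,
                                        BooleanOp::Intersection,
                                        BooleanOp::SymmetricDifference };

    std::vector<RegressionJob> jobs;

    for( std::size_t i = 0; i < meshFiles.size(); ++i )
    {
        for( std::size_t j = i + 1; j < meshFiles.size(); ++j )
        {
            for( BooleanOp op : allOps )
            {
                jobs.push_back( { meshFiles[i], meshFiles[j], op } );
            }
        }
    }

    return jobs;
}
```

With n meshes this produces 4 * n * (n - 1) / 2 jobs, which is why a directory of large meshes can take a while to run.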

These sample meshes range from small and simple to large and very complex.  For the 32bit builds, the library is limited to one million points and some of the samples when meshed together will exceed that limit.  Also, for Float precision builds, there will be numeric over or underflows whereas for x64 Double precision builds, all operations across all meshes should complete successfully.

When Errors (Inevitably) Occur

I have tried to catch error conditions and return those in result objects with some descriptive text, so the library should not crash.  The code is very sensitive to non-manifold meshes.  The algorithms assume both meshes are 2-manifold.  Given the way the optimizations work, a mesh may be self intersecting but if the self intersection is in a part of the model that does not intersect with the second model, the operation may run to completion perfectly.  A different mesh may intersect spatially with the self intersection and trigger an error.

Meshes randomly chosen from the internet are (in my experience) typically not 2-manifold.  I spent a fair amount of time cleaning up the meshes in the sample repository.  If you have a couple of meshes and they do not want to union – use a separate program like MeshLab to look over the meshes and double check that they are both in fact 2-manifold.
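A quick first-pass check can also be done in code.  The sketch below counts how many triangles use each undirected edge; in a closed 2-manifold triangle mesh every edge is shared by exactly two triangles.  This is only a necessary condition (it will not catch ‘bow-tie’ vertices or self intersections) and it is not a check taken from Cork; the function name is mine.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Triangle = std::array<std::size_t, 3>;

// Returns true when every undirected edge is used by exactly two triangles.
// Necessary for a closed 2-manifold mesh, but not sufficient.
bool every_edge_has_two_faces( const std::vector<Triangle>& triangles )
{
    std::map<std::pair<std::size_t, std::size_t>, int> edgeUse;

    for( const Triangle& t : triangles )
    {
        for( int k = 0; k < 3; ++k )
        {
            const std::size_t a = t[k];
            const std::size_t b = t[( k + 1 ) % 3];

            if( a == b )
            {
                return false;   // degenerate triangle with a repeated vertex
            }

            // Store the edge with its smaller index first so (a,b) and (b,a) match
            ++edgeUse[std::make_pair( std::min( a, b ), std::max( a, b ))];
        }
    }

    for( const auto& edge : edgeUse )
    {
        if( edge.second != 2 )
        {
            return false;       // boundary edge (1) or non-manifold edge (3 or more)
        }
    }

    return true;
}
```

A mesh that fails this test is worth repairing before attempting a boolean operation; a mesh that passes may still need a closer look in MeshLab.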

Conclusion

If you are interested in CSG and need a fast boolean library, give Cork a shot.  If you run into crashes or errors, let me know – the code looks stable for my datasets but they are far from exhaustive.

Metal Parts from 3D Prints

Introduction

Although 3D printing technology is advancing rapidly and home 3D printing is becoming both increasingly accessible and reliable, it will likely be a while before metal printing catches up with plastic Fused Filament Fabrication technology.  That said, there is a way to fabricate metal parts from some sets of 3D FFF designs.  In this blog post, I will describe a technique I have been using with some success to produce high quality metal parts from 3D prints.

Metal Clays and ‘Lost HIPS’ Molding

Metal clay is essentially a very low-tech approach to powder metallurgy.  Metal clays are a combination of atomized metal powder and organic, water soluble binders.  When soft, metal clay can be worked like a regular ceramic clay, dried to a hard yet brittle state and finally sintered in a conventional kiln to produce a solid metal piece.  The first metal clays were silver compounds but today metals such as bronze, brass, copper and steel are also readily available.

High Impact Polystyrene (HIPS) is a standard FFF filament, though definitely less popular than PLA or ABS.  My experience with HIPS as a general printing filament is quite good: it is easy to print with and can be printed very successfully on an Elmer’s glue coated, heated borosilicate glass plate.  An advantage of HIPS is that it is soluble in limonene, a solvent derived from citrus fruit rinds.  As organic solvents go, limonene is about as safe as you can find; it is used medicinally for heartburn and GERD.  There is anecdotal evidence of it making a good margarita mixer…

Putting 3D printing with HIPS, metal clay, limonene as a solvent and finally kiln sintering together, we come up with the ‘Lost HIPS’ technique for creating metal parts from 3D prints.

Step 1 : Create a 3D Mold

The beauty and power of CAD/CAM is the ability to define, manipulate, visualize and refine 3D parts numerically prior to actually creating the physical part.  For our purposes, it is straightforward to take a 3D object definition in an STL file and create a mold for that part by performing a boolean difference operation between a rectangular cuboid (i.e. a block) and the 3D part.  For this post, I used the ‘Sun Medallion’ design I found on Thingiverse.  I then used OpenSCAD to create the cuboid and perform the boolean difference to create a mold of the medallion.  I tweaked the mold along the way to strengthen the connections of the arms to the central solar disk but the design is still quite obviously that of Hank Dietz.

When printing a mold, try to find a good middle ground between a mold that is physically strong enough to work with but contains a minimum of HIPS material.  In ‘Lost HIPS’, all the mold material has to be dissolved by the limonene – so less is definitely more.

Sun Medallion mold printed in HIPS on a MakerGear M2 Printer.

HIPS is soluble in acetone as well, which means it can also be vapor polished in the same way as ABS.  I find vapor polishing to be helpful in smoothing the surface of the mold and sealing up any small holes or creases that may be left in the mold after printing – particularly when printing with thin layers.

Step 2: Fill the Mold with Metal Clay

At present, I am using FastFire Bronze Clay as it is relatively cheap (~$200/kg) and easy to work with though I have found it very sensitive to sintering temperatures.  I have also worked with PMC+ and it is easier to work with and very forgiving with respect to sintering temperatures but it is expensive (~$1500/kg).

When filling the mold, I have had the best luck painting the mold with water containing just a tiny amount of dish soap.  The water will cause the clay to form a thinner slurry next to the mold (much like ‘slip’) and the detergent acts as a surfactant to help ensure the slurry covers the entire base of the mold.  NB – do not use much water/detergent solution in the mold, as making the clay runny has led to poor results for me.  I just paint the surface of the mold with a brush and that is it.  I usually put down a first layer of clay with an emphasis on ensuring all the corners, nooks and crannies are filled and then fill the rest of the mold.  I use an old credit card to scrape off excess clay.

Once the mold is filled, I let it stand for a day to dry and sand the whole thing with a 200 grit sanding sponge to remove any excess clay.  It doesn’t take much sanding to get to a point where the finer features of the mold are visible again.  Finally, I wanted to make the sun medallion into necklaces for my daughters, so I added a loop to the back of the piece.  To make the loop, I used three pieces of HIPS together and placed a bit more clay over the HIPS and onto the back of the medallion.

Sun Medallion mold after drying, sanding to remove excess clay and the addition of the necklace loop.

Step 3: Lose the HIPS

Once the clay is dry, place the mold into a container of limonene and let the solvent do its work.  I use a glass container with a fluorinated plastic lid that I found at Bed, Bath and Beyond (don’t forget your coupons).  Limonene is a solvent and will attack non-fluorinated plastics, though plastic gas cans and many consumer plastics are fluorinated these days.  The more HIPS in the mold, the longer it will take to remove the material so expect anywhere from overnight to a couple of days to get all the HIPS removed.  Fortunately, the metal clay does not appear to be nearly as sensitive to limonene as the HIPS is, so a couple of days in a limonene bath does not appear to affect the clay.

Once the bulk of the HIPS is gone, I soak the piece in fresh limonene for a couple of hours to get rid of the rest of the mold material and then dunk it in acetone for a minute or two.  The acetone serves two purposes.  First, it removes any gooey HIPS/limonene emulsion from the surface of the piece and second, it is a drying agent so after just a few minutes in air the piece is dry and can be worked a bit before sintering.

The bronze metal clay Sun Medallion after HIPS removal.  Note a bit of stringy HIPS material on the mesh holder and some HIPS left around the crease between the central disk and the sun arms.  This extra HIPS on the piece will burn off in the kiln.

Step 4: Make Repairs to the Clay Piece before Sintering

In its current state, the dried clay can be worked just as green clay can be worked.  I will typically sand off visible printing artifacts (i.e. steps between layers), fill any voids with fresh clay and file off any excess material from the piece.  It is much easier to add or remove clay material at this point than after sintering.  I also find it helpful to use the water/detergent solution again to paint the surface a couple of times to get a smoother finish.  The metal clay will absolutely reproduce every detail in the printed mold, so if you want a smoother aesthetic look – now is the time to take off the rough edges.

Step 5: Burn out HIPS and Binder then Sinter

Once you are happy with the appearance of the piece, it is time to sinter.  I have a Paragon Caldera kiln which I love.  I did not get the digitally controlled version, which I would strongly suggest for anyone looking to purchase a kiln.  I find the difference between a beautiful finished piece and an under-fired or over-fired piece to be just tens of degrees F.  Thus I end up having to watch my kiln closely as it finishes its ramp to ensure it gets into the right temperature range and holds that range long enough to fully sinter the piece.

Pretty much any material other than silver needs to be fired in an anoxic (i.e. oxygen-free) environment.  For firing metal clays, someone far more clever than I figured out that one could easily create a locally oxygen-free environment by burying the piece in carbon granules during firing.  This process works spectacularly well.  I will not go into the details here, as there are plenty of references online.

I use a ceramic fiber container for firing.  Firing in a stainless steel vessel leaves lots of black oxide in my kiln whereas the ceramic fiber pot leaves no residue whatsoever.  Having sintered with both, I also expect the pot to outlast a stainless vessel.

I typically rest the piece on a piece of fiber kiln paper and then put the piece on the paper into the container filled with an inch or two of acid washed carbon granules.  I do not cover the piece but ramp my kiln to 400F and leave it there for an hour to burn off any remaining HIPS and the binder in the metal clay.

The cleaned up Sun Medallion on a piece of kiln paper.

The bronze clay Sun Medallion and kiln paper on a bed of carbon in the firing vessel.

After burning off any organic compounds left on the piece, I put another piece of kiln paper over the top of the piece and fill the container with carbon granules to within an inch of the top.  I put the lid on the container and ramp my kiln to 1450F and leave it there for an hour.  I then turn the kiln off and crack the lid to cool the piece quickly.

Step 6: Cleaning the Piece

It can take several hours for everything to cool to a point where it can be touched.  In particular, the vessel and carbon will hold heat well.

Once everything has cooled, I remove the piece from the carbon granules and clean it.  I use a Dremel tool with a wire brush to take the black scale off the surface of the piece and then use a bath of Picklean to remove the rest of the oxidation.  It may take a couple Picklean baths to really get the piece cleaned up but it is the only way I have found to get all the little details in the piece bright and shiny.

The finished product

Other Examples

I have created a number of other designs as well, below is a Tudor Rose extracted from a Thingiverse design.  What is interesting about this design is that there are regions in which upper layers overlap lower layers and if your printer does a decent job of bridging and you can force the metal clay into the mold, you can get a fairly intricate design which would be hard to fabricate using other means – like straight up stamping.

A Tudor Rose in Silver and Bronze.  In this example, the bronze piece has been slightly overfired and lost some of the detail of the original.  In contrast, the silver has retained much of the original detail to the point where the individual printing layers are clearly visible.

Conclusion

Though there are a number of steps involved with this process, for folks with more engineering talent than artistic talent this provides a way to create some gorgeous pieces simply by ‘turning the crank’.  After a few practice runs, I have found the ‘Lost HIPS’ process to be fairly straightforward.

Next I will probably try to fabricate some structural pieces using steel clay.  I have tinkered with steel clay once early on and I expect to have similar success with that material as well.

Writing a CGAL Mesh to an STL File

Introduction

The CGAL library for computational geometry is truly a work of art.  It focuses on precision and accuracy above all else and yet manages to stay very flexible through perhaps the most comprehensive use of C++ metaprogramming that I have ever encountered.  CGAL deals efficiently and elegantly with rounding errors in IEEE floating point operations by escalating from IEEE floating point to exact numeric computation when rounding errors may occur.  The library is the product of 15+ years of development by some of the best computational geometry developers on the planet.

Using CGAL

CGAL is not the most accessible library to just pick up and use.  The template based generic programming paradigm can be difficult to wrap your head around at first but there are just enough examples to jump-start HelloWorld style apps.  Beyond that, the data structures are optimized for computational geometry not for obviousness.  Traversing the data structures requires a bit of study and thought to accomplish a task.

One of the more powerful aspects of the CGAL library is the Delaunay 3D Mesh Generator.  This generator can take a variety of geometric elements, such as polyhedrons, and generate a 3D triangulated mesh.  This operation is key to 3D printing as it is that triangulated mesh which is then sliced to create the layer-by-layer extrusion paths.  The OpenSCAD 3D CAD Modeller (http://www.openscad.org/) uses CGAL for binary polyhedral operations and mesh generation.

Writing a CGAL Mesh in STL file Format

There are not a lot of persistence formats supported in the CGAL library itself.  For 3D printing, the primary file format is arguably the STL (Standard Tessellation Language) format.  An STL file contains a list of triangular faces defined by 3 vertices and a normal to the facet.

Though the STL format is straightforward, it took a bit of poking around and experimentation to figure out how to traverse the mesh and order the vertices to ensure that the mesh is manifold.  Getting the various template arguments right was also an occasional issue.

The code below takes a CGAL  Mesh_complex_3_in_triangulation_3 instance, a list of subdomains within the mesh complex and an open stream.  The template function writes the listed subdomains to the stream.  Each subdomain is written as a distinct solid in the STL file. It compiles under VS2010 and newer g++ releases.


#ifndef MESH_TO_STL_H
#define MESH_TO_STL_H

#include <CGAL/bounding_box.h>
#include <CGAL/number_utils.h>

#include <array>
#include <iterator>
#include <list>
#include <ostream>
#include <string>
#include <utility>

template <typename SubdomainIndex>
struct SubdomainRecord
{
    SubdomainRecord( const SubdomainIndex index,
                     const std::string& label )
        : m_index( index ),
          m_label( label )
    {}

    SubdomainIndex m_index;
    std::string m_label;
};

template <typename SubdomainIndex>
class SubdomainList : public std::list<SubdomainRecord<SubdomainIndex>>
{};

//
// The TriangulationPointIterator and TriangulationPointList template classes
// ease the task of iterating over the points associated with vertices in the mesh.
//

template <typename Triangulation>
class TriangulationPointIterator : public std::iterator<std::forward_iterator_tag, typename Triangulation::Point>
{
    typename Triangulation::Finite_vertices_iterator m_currentLoc;

public:

    TriangulationPointIterator()
    {}

    TriangulationPointIterator( const typename Triangulation::Finite_vertices_iterator& vertIterator )
        : m_currentLoc( vertIterator )
    {}

    TriangulationPointIterator( const TriangulationPointIterator& mit )
        : m_currentLoc( mit.m_currentLoc )
    {}

    TriangulationPointIterator& operator++() { ++m_currentLoc; return *this; }
    TriangulationPointIterator operator++(int) { TriangulationPointIterator tmp( *this ); operator++(); return tmp; }
    bool operator==( const TriangulationPointIterator& rhs ) const { return( m_currentLoc == rhs.m_currentLoc ); }
    bool operator!=( const TriangulationPointIterator& rhs ) const { return( m_currentLoc != rhs.m_currentLoc ); }
    typename Triangulation::Point& operator*() { return( m_currentLoc->point() ); }
};

template <typename Triangulation>
class TriangulationPointList
{
    // Held by reference to avoid copying the whole triangulation;
    // the triangulation must outlive this object.

    const Triangulation& m_triangulation;

public:

    TriangulationPointList( const Triangulation& triangulation )
        : m_triangulation( triangulation )
    {}

    TriangulationPointIterator<Triangulation> begin()
    {
        typename Triangulation::Finite_vertices_iterator beginningVertex = m_triangulation.finite_vertices_begin();

        return( TriangulationPointIterator<Triangulation>( beginningVertex ));
    }

    TriangulationPointIterator<Triangulation> end()
    {
        typename Triangulation::Finite_vertices_iterator endingVertex = m_triangulation.finite_vertices_end();

        return( TriangulationPointIterator<Triangulation>( endingVertex ));
    }

};

//
// This function writes the ASCII version of an STL file. Writing the binary version should be
// a straightforward modification of this code.
//

template <typename C3T3>
std::ostream&
output_boundary_of_c3t3_to_stl( const C3T3& c3t3,
                                const SubdomainList<typename C3T3::Subdomain_index>& subdomainsToWrite,
                                std::ostream& outputStream )
{
    typedef typename C3T3::Triangulation Triangulation;
    typedef typename Triangulation::Vertex_handle VertexHandle;
    typedef SubdomainList<typename C3T3::Subdomain_index> Subdomains;

    // This is an ugly path to the Kernel type but this works and is all compile time anyway

    typedef typename Triangulation::Geom_traits::Compute_squared_radius_3::To_exact::Source_kernel Kernel;

    // Get the bounding box for the mesh so we can offset it into the all positive octant

    TriangulationPointList<Triangulation> pointList( c3t3.triangulation() );

    typename Kernel::Iso_cuboid_3 boundingBox = CGAL::bounding_box( pointList.begin(), pointList.end() );

    typename Kernel::Vector_3 offset( 1 - boundingBox.xmin(), 1 - boundingBox.ymin(), 1 - boundingBox.zmin() );

    // Iterate over the facets in the mesh

    std::array<VertexHandle,3> vertices;

    for( typename Subdomains::const_iterator itrSubdomain = subdomainsToWrite.begin(); itrSubdomain != subdomainsToWrite.end(); ++itrSubdomain )
    {
        // Write the solid prologue to the stream

        outputStream << "solid " << itrSubdomain->m_label << std::endl;
        outputStream << std::scientific;

        for( typename C3T3::Facets_in_complex_iterator itrFacet = c3t3.facets_in_complex_begin(), end = c3t3.facets_in_complex_end(); itrFacet != end; ++itrFacet )
        {
            // Get the subdomain index for the cell and the opposite cell

            typename C3T3::Subdomain_index cell_sd = c3t3.subdomain_index( itrFacet->first );
            typename C3T3::Subdomain_index opp_sd = c3t3.subdomain_index( itrFacet->first->neighbor( itrFacet->second ));

            // Skip the facet unless at least one of the two cells is in the subdomain we are writing

            if(( cell_sd != itrSubdomain->m_index ) && ( opp_sd != itrSubdomain->m_index ))
            {
                continue;
            }

            // Get the vertices of the facet: the three cell vertices other than the one opposite the facet

            for( int j = 0, i = 0; i < 4; ++i )
            {
                if( i != itrFacet->second )
                {
                    vertices[j++] = (*itrFacet).first->vertex(i);
                }
            }

            // If the facet is not oriented properly, swap the first two vertices to flip it

            if(( cell_sd == itrSubdomain->m_index ) != ( itrFacet->second%2 == 1 ))
            {
                std::swap( vertices[0], vertices[1] );
            }

            // Get the unit normal so we can write it

            const typename Kernel::Vector_3 unit_normal = CGAL::unit_normal( vertices[0]->point(), vertices[1]->point(), vertices[2]->point() );

            // Write the facet record to the file

            outputStream << "facet normal " << unit_normal << std::endl;
            outputStream << "outer loop" << std::endl;
            outputStream << "vertex " << vertices[0]->point() + offset << std::endl;
            outputStream << "vertex " << vertices[1]->point() + offset << std::endl;
            outputStream << "vertex " << vertices[2]->point() + offset << std::endl;
            outputStream << "endloop" << std::endl;
            outputStream << "endfacet" << std::endl << std::endl;
        }

        // Write the epilog for the solid

        outputStream << "endsolid " << itrSubdomain->m_label << std::endl;
    }

    // Return the stream and we are done

    return( outputStream );
}

#endif // MESH_TO_STL_H

The code above includes a pair of helper template classes to ease iterating over mesh points for determining the bounding box for the mesh. The STL format requires that all points be positive but it doesn’t care about units.

The following code snippet demonstrates how to call the template function. The template parameter is inferred from the function arguments.

 SubdomainList<C3t3::Subdomain_index> subdomainsToWrite;

 subdomainsToWrite.push_back( SubdomainRecord<C3t3::Subdomain_index>( 0, std::string( "elephant" ) ));


 std::ofstream outputStream( "elephant.stl" );

 output_boundary_of_c3t3_to_stl( c3t3, subdomainsToWrite, outputStream );

 outputStream.close();

Conclusion

In later posts, I will follow up with CGAL examples of using Nef Polyhedra and performing the kinds of binary operations needed for CSG (Constructive Solid Geometry) applications.  Having the facility to mesh the polyhedra and persist the mesh in a file format that can then be consumed by a slicer and printed is a valuable stepping stone.

Building GCC Plugins – Part 3 C++ Libraries

As discussed in the prior post, I have started a set of C++ libraries to reduce the complexity of writing GCC Plugins and interpreting the GCC Abstract Syntax Tree.  In this post I will provide a high level description of the libraries and walk through the dependencies and directory structures.  The libraries are available on Github: ‘stephanfr/GCCPlugin’.

NB – At the time of writing, I am going through successive revisions and refactoring passes on the library, so expect anything you build now to break with my next commit to GitHub.  The interfaces will settle down in time and I will ‘chill’ them at some point in hopefully the not too distant future.

Licensing and Dependencies

All of the libraries with the exception of the unit test library link directly with the GCC source code, therefore they are all licensed with GPL V3.0.  The libraries are built with the C++11 language features and have dependencies on the Standard Library shipped with GCC and Boost libraries.  The unit testing framework depends on the Google Test libraries.

Programming Style

For what it is worth, I’ve been writing C++ code for a long, long time and am somewhat opinionated regarding some development practices.  First, I use anything in the standard C++ library – in particular I do not write containers.  Second, I use the std::string class in preference to char* strings almost exclusively.  For external interfaces I may expose a char* type but under the interface any char* will almost always map straight back to a std::string instance.  Third, I use anything from the Boost library that suits my needs.  The Boost libraries are excellent, so don’t waste your time re-inventing a component in that library; in all likelihood your component will not be as good anyway.  Fourth, there are some naked pointers in these libraries but in general I try to use a std::unique_ptr or std::shared_ptr in any code written today (I will fix any naked pointers in this library as I refactor).  The standard library smart pointers are a bit more difficult to use than naked pointers, but that difficulty is a result of them enforcing the semantics necessary to know when to delete a pointer they wrap.  Finally, I really like C++11 – I’d strongly suggest cutting over to it.

With regard to my coding format, it is idiosyncratic.  Indentation and spacing don’t quite adhere to any standard, but at least I no longer use Hungarian notation – though that was a hard habit to kick.

Project and Directory Structure

The project is currently composed of seven directories, each with a single Eclipse CDT C++ project:

  1. CPPLanguageModel – a compiler-neutral class library of C++ language elements
  2. GCCInternalsTools – a set of classes and functions tailored specifically to the GCC g++ compiler to build a CPPLanguageModel representation of the code being compiled and to enable insertion of new code into the AST
  3. GCCInternalsUTFixture – a test fixture providing an abstraction of the GCCInternalsTools designed to permit the creation of unit tests for the library without any dependency on the GCC specific libraries themselves
  4. GCCInternalsUnitTest – a set of unit tests for key features of the GCC Plugins libraries
  5. TestExtensions – a collection of test framework ‘plugins’ that rely on GCCInternalsTools and the GCC headers; a separate project is used to prevent dependencies on GCC internals to leak into the main Unit Test framework
  6. GCCPlugin – a ‘HelloWorld’ style plugin for GCC Plugins using this framework
  7. Utility – Various utility classes to simplify coding and implement design patterns I like

The most up-to-date examples of using the libraries will be in the unit test projects.  Similarly, if you go wandering through the code you will frequently see blocks of code commented out.  I tend to leave code I have refactored in place for a revision or two just in case a bug crawls out.  I find it is a bit easier than going back through prior revisions in source code control but it can make the code a little messy at points.  When I get to a version I am happy with, I go through a couple cleaning passes and knock out dead or legacy code.

Design Philosophy

The innards of GCC are absolutely not for the faint of heart.  A primary design goal of this framework is to insulate someone wanting to produce a GCC Plugin from the complexity of the compiler and its design paradigms.  At present, only two GCC header files (config.h and gcc-plugin.h) are required to build a plugin with this framework and all functionality exposed through the framework’s API is abstracted from GCC itself.  The framework is built for manipulating the Abstract Syntax Tree for C++ language programs but could be modified to match other languages.

To use the framework, you ought to only include header files from the CPPLanguageModel project.  Actually, the ASTDictionary.h and PluginManager.h header files will pull in most of the declarations needed to build your plugin.  Two header files from the gcc distribution are also needed: config.h and gcc-plugin.h

The object model exposed by the framework is that of a Dictionary of all the types and declarations in the code being compiled by g++ with the plugin loaded.  The dictionary is indexed by namespace, entry fully qualified name, entry source code location, entry UID and an identity field.  All of the indices are exposed by the ASTDictionary class and can be used for searching the dictionary for a specific entry.  The identity, UID and fully qualified name indices are unique whereas the namespace and source location indices are non-unique and may return a range of results.
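The difference between the unique and non-unique indices can be illustrated with standard containers.  The sketch below is not how ASTDictionary is implemented; the Entry struct, the TinyDictionary class and its method names are hypothetical and cover only three of the five indices.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified entry; the framework's own entries carry much more.
struct Entry
{
    unsigned uid;
    std::string fullyQualifiedName;
    std::string enclosingNamespace;
};

class TinyDictionary
{
    std::vector<Entry> m_entries;
    std::map<unsigned, std::size_t> m_byUid;                    // unique index
    std::map<std::string, std::size_t> m_byName;                // unique index
    std::multimap<std::string, std::size_t> m_byNamespace;      // non-unique index

public:

    // Returns false and leaves the dictionary unchanged if a unique key is already present.
    bool insert( const Entry& entry )
    {
        if( m_byUid.count( entry.uid ) || m_byName.count( entry.fullyQualifiedName ))
        {
            return false;
        }

        const std::size_t slot = m_entries.size();

        m_entries.push_back( entry );
        m_byUid[entry.uid] = slot;
        m_byName[entry.fullyQualifiedName] = slot;
        m_byNamespace.insert( std::make_pair( entry.enclosingNamespace, slot ));

        return true;
    }

    // A unique index yields at most one entry; nullptr means not found.
    const Entry* find_by_uid( unsigned uid ) const
    {
        const auto it = m_byUid.find( uid );

        return( it == m_byUid.end() ? nullptr : &m_entries[it->second] );
    }

    // A non-unique index yields a range of entries, possibly empty.
    std::vector<const Entry*> find_by_namespace( const std::string& ns ) const
    {
        std::vector<const Entry*> result;
        const auto range = m_byNamespace.equal_range( ns );

        for( auto it = range.first; it != range.second; ++it )
        {
            result.push_back( &m_entries[it->second] );
        }

        return result;
    }
};
```

Note that the returned pointers are only valid until the next insert, since the entries live in a std::vector; a container such as Boost.MultiIndex would avoid that limitation and keep all the indices in one structure.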

The dictionary contains entries for different types and declarations.  Entries will be one of the following ‘kinds’: CLASS, UNION, FUNCTION, GLOBAL_VAR, TEMPLATE or UNRECOGNIZED.  The UNRECOGNIZED kind is simply a catch-all for any AST tree elements that have not yet been added to the tree parser.  Dictionary entries are effectively stubs from which the actual definition of the entry may be extracted.  Definitions contain the detailed, ‘kind’ specific information about the entry.  For example, the ClassDefinition object contains the base classes, fields, methods, template methods and friends for the class type.  Source location, namespace, UID, static and extern flags and a list of attributes are available for all dictionary entries and those values are copied into the more detailed definitions as well.

I’ve tried to ensure that the AST tree parser will pass through the tree adding dictionary entries for elements it recognizes and ignoring everything else.  My intent is that it should not crash on encountering some language element it does not recognize in the AST but I have not run the parser over a whole lot of code so I will stick to ‘intent’ for now.  At present, the parser recognizes unions but does not yet provide a detailed definition of union types.  I figured it was more valuable to get some code injection functionality in place before sweating through the details of union representations in the GCC AST.

Current Supported Versions of GCC

The internals of GCC are constantly in flux and functionally there are no ‘frozen’ APIs or data structures that one can depend upon remaining static release over release.  The changes are unlikely to be significant release over release but there is a high probability of breaking changes associated with any release.

The code currently compiles and runs with GCC 4.8.0.  I can make no guarantees that it will compile and run with later releases, though hopefully nothing should break between double dot releases.

Example Plugin

An example ‘HelloWorld’ plugin appears below.  The four header files appear at the top.  The plugin_is_GPL_compatible symbol is needed for licensing compliance with the GCC suite.

There exists an implementation of the CPPModel::CallbackIfx interface which is used by the framework to call back into the plugin at specific times in the compilation process.  There are entry points for when the AST is ready, for a point at which namespaces may be declared and a point at which code may be injected.  For the sample plugin, all that happens is that the contents of the TestNamespace inside the code being compiled is dumped to cerr.  The plugin_init function is part of the GCC plugin framework and is rather straightforward when using these abstraction libraries.


/*-------------------------------------------------------------------------------
Copyright (c) 2013 Stephan Friedl.

All rights reserved. This program and the accompanying materials
are made available under the terms of the GNU Public License v3.0
which accompanies this distribution, and is available at
http://www.gnu.org/licenses/gpl.html

Contributors:
 Stephan Friedl
-------------------------------------------------------------------------------*/

#include "config.h"

#include "ASTDictionary.h"
#include "PluginManager.h"

#include "gcc-plugin.h"

int plugin_is_GPL_compatible;

class Callbacks : public CPPModel::CallbackIfx
{
public :

	Callbacks()
	{}

	virtual ~Callbacks()
	{}

	// Called by the framework once the AST is ready: dump TestNamespace to cerr.
	void ASTReady()
	{
		std::list<std::string> namespacesToDump( { "TestNamespace::" } );

		CPPModel::GetPluginManager().GetASTDictionary().DumpASTXMLByNamespaces( std::cerr, namespacesToDump );
	}

	// Called at the point where namespaces may be declared; unused in this example.
	void CreateNamespaces()
	{
	}

	// Called at the point where code may be injected; unused in this example.
	void InjectCode()
	{
	}

};

Callbacks g_pluginCallbacks;

// Entry point required by the GCC plugin framework.
int plugin_init( plugin_name_args* info, plugin_gcc_version* ver )
{
	std::cerr << "Starting Plugin: " << info->base_name << std::endl;

	CPPModel::GetPluginManager().Initialize( "HelloWorld Plugin", &g_pluginCallbacks );

	return( 0 );
}

 

Example Output

A sample program to be compiled appears below.  This code has the TestNamespace declared and it is the contents of that namespace that will be dumped by the plugin above.

#include <iostream>

namespace TestNamespace
{
	class TestClass
	{
	public :

		int			publicInt;

		int			getPublicInt() const
		{
			return( publicInt );
		}

	protected :

		double		getPrivateDouble() const
		{
			return( privateDouble );
		}

	private :

		double		privateDouble;
	};

	char*		globalString = "This is a global string";

	TestClass	globalTestClassInstance;
}

int main()
{
	std::cout << "!!!Hello World!!!" << std::endl; // prints !!!Hello World!!!

	return 0;
}

The command line required to invoke g++ with the plugin and compile the above file follows:

/usr/gcc-4.8.0/bin/gcc-4.8.0 -c -std=c++11 -fplugin=libGCCPlugin.so HelloWorld.cpp

When g++ initializes, it loads the sample plugin and when the AST is ready, the plugin dumps the following to standard error.  It isn’t perfect XML but ought to be good enough to analyze the program being compiled.

8: 2014-09-23 21:03:42   [LoggingInitialization] [NORMAL]  Logging Initiated
Starting Plugin: libGCCPlugin
HelloWorld.cpp:34:24: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
  char*  globalString = "This is a global string";
                        ^
<ast>
    <dictionary>
        <namespace name="TestNamespace::">
            <dictionary_entry>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <name>TestClass</name>
                <uid>20720</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>9</line>
                    <char-count>1</char-count>
                    <location>6451683</location>
                </source-info>
            </dictionary_entry>
            <dictionary_entry>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <name>globalString</name>
                <uid>28506</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>34</line>
                    <char-count>1</char-count>
                    <location>6454884</location>
                </source-info>
                <static>true</static>
            </dictionary_entry>
            <dictionary_entry>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <name>globalTestClassInstance</name>
                <uid>28507</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>36</line>
                    <char-count>1</char-count>
                    <location>6455143</location>
                </source-info>
                <static>true</static>
            </dictionary_entry>
        </namespace>
    </dictionary>
    <elements>
        <namespace name="TestNamespace::">
            <class type="class">
                <name>TestClass</name>
                <uid>20720</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>9</line>
                    <char-count>1</char-count>
                    <location>6451683</location>
                </source-info>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <compiler_specific>
                    </artificial>
                </compiler_specific>
                <base-classes>
                </base-classes>
                <friends>
                </friends>
                <fields>
                    <field>
                        <name>publicInt</name>
                        <source-info>
                            <file>HelloWorld.cpp</file>
                            <line>13</line>
                            <char-count>1</char-count>
                            <location>6452196</location>
                        </source-info>
                        <type>
                            <kind>fundamental</kind>
                            <declaration>int</declaration>
                        </type>
                        <access>PUBLIC</access>
                        <static>false</static>
                        <offset_info>
                            <size>4</size>
                            <alignment>4</alignment>
                            <offset>0</offset>
                            <bit_offset_alignment>128</bit_offset_alignment>
                            <bit_offset>0</bit_offset>
                        </offset_info>
                    </field>
                    <field>
                        <name>privateDouble</name>
                        <source-info>
                            <file>HelloWorld.cpp</file>
                            <line>30</line>
                            <char-count>1</char-count>
                            <location>6454374</location>
                        </source-info>
                        <type>
                            <kind>fundamental</kind>
                            <declaration>double</declaration>
                        </type>
                        <access>PRIVATE</access>
                        <static>false</static>
                        <offset_info>
                            <size>8</size>
                            <alignment>8</alignment>
                            <offset>0</offset>
                            <bit_offset_alignment>128</bit_offset_alignment>
                            <bit_offset>64</bit_offset>
                        </offset_info>
                    </field>
                </fields>
                <methods>
                    <method>
                        <name>getPublicInt</name>
                        <uid>28497</uid>
                        <source-info>
                            <file>HelloWorld.cpp</file>
                            <line>15</line>
                            <char-count>1</char-count>
                            <location>6452452</location>
                        </source-info>
                        <access>PUBLIC</access>
                        <static>false</static>
                        <result>
                            <type>
                                <kind>fundamental</kind>
                                <declaration>int</declaration>
                            </type>
                        </result>
                        <parameters>
                            <parameter>
                                <name>this</name>
                                <type>
                                    <kind>derived</kind>
                                    <declaration>
                                        <operator>pointer</operator>
                                        <type>
                                            <kind>class-or-struct</kind>
                                            <declaration>TestNamespace::TestClass</declaration>
                                            <namespace>
                                                <name>TestNamespace::</name>
                                            </namespace>
                                        </type>
                                    </declaration>
                                </type>
                                <compiler_specific>
                                    </artificial>
                                </compiler_specific>
                            </parameter>
                        </parameters>
                    </method>
                    <method>
                        <name>getPrivateDouble</name>
                        <uid>28499</uid>
                        <source-info>
                            <file>HelloWorld.cpp</file>
                            <line>22</line>
                            <char-count>1</char-count>
                            <location>6453350</location>
                        </source-info>
                        <access>PROTECTED</access>
                        <static>false</static>
                        <result>
                            <type>
                                <kind>fundamental</kind>
                                <declaration>double</declaration>
                            </type>
                        </result>
                        <parameters>
                            <parameter>
                                <name>this</name>
                                <type>
                                    <kind>derived</kind>
                                    <declaration>
                                        <operator>pointer</operator>
                                        <type>
                                            <kind>class-or-struct</kind>
                                            <declaration>TestNamespace::TestClass</declaration>
                                            <namespace>
                                                <name>TestNamespace::</name>
                                            </namespace>
                                        </type>
                                    </declaration>
                                </type>
                                <compiler_specific>
                                    </artificial>
                                </compiler_specific>
                            </parameter>
                        </parameters>
                    </method>
                </methods>
                <template_methods>
                </template_methods>
            </class>
            <global_var_entry>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <name>globalString</name>
                <uid>28506</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>34</line>
                    <char-count>1</char-count>
                    <location>6454884</location>
                </source-info>
                <static>true</static>
                <type>
                    <kind>derived</kind>
                    <declaration>
                        <operator>pointer</operator>
                        <type>
                            <kind>fundamental</kind>
                            <declaration>char</declaration>
                        </type>
                    </declaration>
                </type>
            </global_var_entry>
            <global_var_entry>
                <namespace>
                    <name>TestNamespace::</name>
                </namespace>
                <name>globalTestClassInstance</name>
                <uid>28507</uid>
                <source-info>
                    <file>HelloWorld.cpp</file>
                    <line>36</line>
                    <char-count>1</char-count>
                    <location>6455143</location>
                </source-info>
                <static>true</static>
                <type>
                    <kind>class-or-struct</kind>
                    <declaration>TestNamespace::TestClass</declaration>
                    <namespace>
                        <name>TestNamespace::</name>
                    </namespace>
                </type>
            </global_var_entry>
        </namespace>
    </elements>
</ast>
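
To show how the offset_info elements in the dump might be used, here is a small sketch that turns them into byte positions.  It rests on an assumption not stated in the output itself: that offset is measured in bytes and bit_offset in bits, in the same way as GCC's DECL_FIELD_OFFSET and DECL_FIELD_BIT_OFFSET.  The OffsetInfo struct and both helper functions are hypothetical names for this example only.

```cpp
#include <cstddef>

// Mirrors one <offset_info> element from the dump.  Assumption: <offset> is in
// bytes and <bit_offset> is in bits.
struct OffsetInfo
{
	std::size_t	size;
	std::size_t	alignment;
	std::size_t	offset;
	std::size_t	bitOffsetAlignment;
	std::size_t	bitOffset;
};

constexpr std::size_t	kBitsPerByte = 8;

// Byte position of a field measured from the start of its enclosing class.
constexpr std::size_t ByteOffset( const OffsetInfo& info )
{
	return( info.offset + ( info.bitOffset / kBitsPerByte ) );
}

// Position of the first byte past the end of the field.
constexpr std::size_t EndOffset( const OffsetInfo& info )
{
	return( ByteOffset( info ) + info.size );
}
```

Under that assumption, the values reported for TestClass place publicInt (size 4, offset 0, bit_offset 0) at byte 0 and privateDouble (size 8, offset 0, bit_offset 64) at byte 8, which leaves 4 bytes of padding between the two fields.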

Conclusion

It has taken a while to get this far but I will dive into the internals of the framework and provide examples of code injection in future posts.

 

Printing with Taulman Bridge Nylon

I have been dabbling in 3D printing for the last six months with my MakerGear M2 printer (http://www.makergear.com), a fantastic precision machine tool, and have done a lot of printing in PLA.  PLA is great for a lot of objects, particularly the various elephants and other things I print for my kids, but it is too brittle for some applications.  I have a couple of projects I am contemplating that require a tougher material, so I gave Taulman Bridge Nylon a try.  Like all things in 3D printing, there is a learning curve, but once you have a process for printing with this nylon, it is a fantastic material.

Taulman Bridge

The ‘Bridge’ nylon is intended to combine the toughness of nylon with the printing ease of PLA.  I have not tried printing with regular nylon so I cannot comment on how much easier it is to print with Bridge, but printing with Bridge is not quite like printing with PLA.  Bridge still absorbs water, it requires a higher print temperature and, at least for me, it needs thicker layers and a slower speed than PLA.  That said, with the right adjustments in place my success rate printing with Bridge is pretty darn close to my success rate with PLA: probably 90%+ of the prints I start complete acceptably.

Using Bridge

First off, this is the process I use.  I live in Colorado at about 5000ft elevation and relatively low humidity.  Your mileage may vary…

Step 1: Dry the Material

Bridge comes on a small diameter spool sealed in a bag with silica gel drying packets inside the bag.  Despite the Bridge formulation to reduce wetness, the packaging and the relative dryness of Colorado, I had little success printing with Bridge right out of the bag.  Right out of the bag, you will see a lot of steam coming out of the nozzle and I got intermittent sputtering as well.  I tried baking the spool in the oven for 6 hours at 175F and this seemed to work reasonably well though the spool deformed a bit.  Also, I got the sense that the inner layers of the spool may not have dried as well as the outer layers.

I adjusted my drying technique a bit by getting a small toaster oven from Walmart and then using it to bake just enough loose filament pulled off the spool for a given print:

Baking Taulman Bridge Nylon

Preparing to dry a length of Taulman Bridge nylon in a small toaster oven in my garage. I typically pull enough material from the main spool for a specific print, clip it and then bake it loose in the oven at 175F for 6 to 8 hours.

After baking, I let the material cool in the oven for a bit and then I transfer it immediately into a zip-lock bag with a couple packs of silica gel for further cooling and drying overnight.

Step 2:  Printer Settings

I use the Simplify3D (http://www.simplify3d.com) software package to slice my models and control the M2.  I’ve used a number of the more popular open source packages as well, but I really like having all the key functions in one place with an easy to navigate GUI.  I took the suggested settings for the M2 and nylon and through trial and error made some adjustments from that starting point.  The primary problem I ran into was ‘popcorn’ from the print instead of a smooth stream of nylon.  After that I also had some adhesion and warping issues.  The four main changes I made were to bump up the extruder temperature to 245C, start the build plate at 70C and ramp to 90C from the second layer on, cut the printing speed in half and stick to a 0.3mm layer height.

Below are the config screens from Simplify3D with the settings I use for Bridge:

Taulman-Extruder

Extruder settings for printing with Taulman Bridge on my MakerGear M2. Note the Ooze Control settings.

Taulman-Layer

Layer settings

Taulman-PrimaryExtruder

Extrusion temperature set at 245C

Taulman-HeatedBed

Heated bed at 70C for the first layer and then 90C for the rest.

Taulman-Other

I reduced the default printing speed by 50% from the speed used for PLA or suggested for nylon.

Taulman-Advanced

Retraction control and ooze rate.

Step 3: Preparing the Build Plate

A couple of test prints on clean glass had adhesion and/or warping problems.  I gave the Elmer’s glue coating I use for PLA a shot and it worked extraordinarily well.  So well, in fact, that I have to use a very thin layer of glue, as with thicker layers the nylon is very, very hard to detach.  Below are a couple of pictures of the wet glue on the plate and what it looked like dry just before printing.

Build Plate with Fresh Elmer's Glue

Build plate with a thin coating of Elmer’s white glue to improve nylon adhesion.

Build Plate with Dried Elmer's Glue

Plate with the dried glue at 70C.

Finished Product:

For an example of what is possible with Bridge on the M2, I printed Emmett’s Gear Bearing from Thingiverse (http://www.thingiverse.com/thing:53451).  This is an absolutely ingenious design of a bearing that can only be produced with 3D printing.  If you wanted to use this in a real project, then nylon would be a far better material than PLA or ABS.  If you look at the design, it is pretty clear that if your printer or material isn’t dialed in well, your odds of getting a working bearing are slim.  There are lots of opposing surfaces which could fuse and turn the bearing into a hockey puck.  Emmett’s designs on Thingiverse are exceptional; if you have not looked them over, do yourself a favor and do so.

If the piece will not come loose easily from the build plate, I usually put the plate in the freezer for 5 or 10 minutes after which the piece generally pops right off.  Check the start of the video for that demo.

Completed Bearing in Nylon

Completed Gear Bearing by Emmett printed on a MakerGear M2 with Taulman Bridge nylon.

Demo Video:

 

Final Thoughts :

My experience with the Bridge material has been great, once I got the process right.  Dry material, higher temperature, slow printing, thick layers and glue for adhesion all seem to matter but the results as demonstrated by the pictures and video are pretty self-evident.