ULIS dev4.0.7
Utility Library for Imaging Systems
An overview of the various elements involved in setting up an Image Processing feature for ULIS.
We will take the Clear feature as an example to dissect the inner mechanisms used to set up a feature so that it works in the async pipeline with its various requirements. We will start from the top-level public API and gradually dive into the private implementation parts.
The first step is to implement what we call an entry-point in the FContext class, that is the primary symbol exposed to the user. It is a public method of the FContext class, and the method signature follows a simple pattern. Before implementing an entry-point we should ask ourselves a few questions:
For a Clear operation, we need only one block, that will be processed in-place, and we are allowed to process a sub-rectangle of the geometry if we want to, so we should also pass a rectangle as an argument. We don't need any other arguments in order to complete the task ( Clear is one of the simplest functions available ). The method definition would look like this:
"include/Context/Context.h"
The signature shows that we expect a block passed by reference. We prefer a reference over a pointer here because it is "safer": passing a nullptr to the operation would make no sense. The reference is non-const, which indicates that the input block can be modified. The signature also shows that we can pass an optional rectangle to indicate which portion of the block should be cleared. If the argument is omitted, it defaults to the largest possible geometry, even though that is bigger than the block geometry; this geometry has to be sanitized in the method implementation. Then we see a group of arguments related to the asynchronous mechanisms, which are common to ALL entry-points in FContext:
Now, for a given feature entry-point in FContext, we associate a symmetrical cached function pointer in FContextualDispatchTable. This function pointer is initialized at runtime, during construction of an FContextualDispatchTable object, and will be used by the implementation of the Clear method in FContext:
"source/Context/ContextualDispatchTable.h"
"source/Context/ContextualDispatchTable.cpp"
As we can see, the FContextualDispatchTable now has a "mScheduleClear" member of type "fpCommandScheduler", which is the common type for all cached feature entries. The "mScheduleClear" member is const, so it has to be initialized in the constructor initializer list, and it is initialized from an unusual-looking statement:
This statement has a lot happening inside, and we will come back to it. For now, just know that "TDispatcher" is a template functor that queries and retrieves the dispatched implementation according to runtime arguments such as the hardware metrics or the format, and that "FDispatchedClearInvocationSchedulerSelector" is a structure generated automatically during preprocessing that holds all the information "TDispatcher" needs to work.
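To make the idea of a cached, runtime-dispatched function pointer more concrete, here is a minimal standalone sketch. It is not the ULIS implementation: HardwareInfo, Path, SchedulerFn, ClearSelector, Dispatcher and DispatchTable are hypothetical stand-ins, and only the member name mScheduleClear mirrors the text above. It shows a const function pointer initialized once in the constructor initializer list from a template dispatcher that reads a selector structure.

```cpp
#include <cassert>

// Hypothetical stand-in for the hardware metrics queried at runtime.
struct HardwareInfo {
    bool hasAVX;
    bool hasSSE;
};

// Which code path an implementation represents (for illustration only).
enum class Path { AVX, SSE, MEM };

// Common function pointer type shared by every cached entry (assumed shape).
using SchedulerFn = Path (*)();

Path ScheduleClearAVX() { return Path::AVX; }
Path ScheduleClearSSE() { return Path::SSE; }
Path ScheduleClearMEM() { return Path::MEM; }

// Selector: a plain structure listing the available implementations.
// In the text above, a structure of this kind is generated by a macro.
struct ClearSelector {
    static SchedulerFn Avx() { return &ScheduleClearAVX; }
    static SchedulerFn Sse() { return &ScheduleClearSSE; }
    static SchedulerFn Mem() { return &ScheduleClearMEM; }
};

// Dispatcher: picks the best implementation the hardware supports.
template <typename Selector>
struct Dispatcher {
    static SchedulerFn Query(const HardwareInfo& hw) {
        if (hw.hasAVX) return Selector::Avx();
        if (hw.hasSSE) return Selector::Sse();
        return Selector::Mem();  // plain memory fallback, no SIMD
    }
};

// Dispatch table: the pointer is const, so it must be set in the
// constructor initializer list and never changes afterwards.
class DispatchTable {
public:
    explicit DispatchTable(const HardwareInfo& hw)
        : mScheduleClear(Dispatcher<ClearSelector>::Query(hw)) {
        assert(mScheduleClear != nullptr);
    }

    const SchedulerFn mScheduleClear;
};
```

The point of resolving the pointer once at construction is that every later call goes straight through the cached pointer, without re-evaluating the hardware metrics.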
Let's go back to FContext and take a look at the actual implementation of our Clear entry point:
"source/Context/Commands/Context.Clear.cpp"
First, we recognize the same familiar function signature. Now let us break down the various steps involved here:
The first "geometry sanitize" phase is straightforward: we fetch the block geometry and intersect it with the input geometry. That way, if the input geometry is bigger than the block, it is clamped, and if there is no intersection at all, the result is an invalid rectangle that we can check for.
The event safety phase checks whether the obtained rectangle is valid. If it is not, the method returns because no work is needed ( a no-op ), while notifying the input iEvent that it doesn't need to wait for anything, which effectively completes the task immediately.
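The two phases described above can be sketched in isolation as follows. This is an illustrative sketch only: Rect, Event, kMaxRect, Intersect and SanitizeAndCheck are hypothetical names, not ULIS types, and the event is reduced to a single flag.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Hypothetical rectangle type: position plus size.
struct Rect {
    int x, y, w, h;
    bool IsValid() const { return w > 0 && h > 0; }
};

// Hypothetical "largest possible geometry" used as a default argument.
const Rect kMaxRect{0, 0, std::numeric_limits<int>::max(),
                    std::numeric_limits<int>::max()};

// Hypothetical event: 'ready' means nobody has to wait on it.
struct Event {
    bool ready = false;
};

// Intersection computed on edges, in 64 bits so that x + w cannot
// overflow when w is INT_MAX. An empty result has w == 0 or h == 0.
Rect Intersect(const Rect& a, const Rect& b) {
    const long long x1 = std::max<long long>(a.x, b.x);
    const long long y1 = std::max<long long>(a.y, b.y);
    const long long x2 = std::min<long long>(static_cast<long long>(a.x) + a.w,
                                             static_cast<long long>(b.x) + b.w);
    const long long y2 = std::min<long long>(static_cast<long long>(a.y) + a.h,
                                             static_cast<long long>(b.y) + b.h);
    return Rect{static_cast<int>(x1), static_cast<int>(y1),
                static_cast<int>(std::max<long long>(x2 - x1, 0)),
                static_cast<int>(std::max<long long>(y2 - y1, 0))};
}

// Returns true when there is work to schedule. On a no-op, the event
// (if any) is marked ready immediately and false is returned.
bool SanitizeAndCheck(const Rect& blockRect, const Rect& requested,
                      Rect& outRoi, Event* event) {
    outRoi = Intersect(blockRect, requested);  // geometry sanitize
    if (!outRoi.IsValid()) {                   // event safety
        if (event) event->ready = true;
        return false;
    }
    return true;
}
```

For example, calling SanitizeAndCheck with a 64x32 block and kMaxRect yields the full block rectangle, while a request entirely outside the block returns false and marks the event ready.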
Finally we build the command, and this is a more serious part of the implementation, because many things happen at the same time:
What happens inside an FCommand or inside the FCommandQueue is not relevant for the purpose of implementing a new feature, so let's focus on the feature. What is interesting to notice here is that we use the mScheduleClear member from the contextual dispatch table, and that we mention an FSimpleBufferCommandArgs object, so let us have a look at that.
FSimpleBufferCommandArgs, along with FSimpleBufferJobArgs, are commonly used command arguments for operations that work in-place and don't require additional arguments.
"source/Scheduling/SimpleBufferArgs.h"
Some features will require you to implement a custom Args object for this command, but most will be simple enough to only require FSimpleBufferCommandArgs, or its counterpart FDualBufferCommandArgs for operations that don't work in-place.
Now everything in the top-level API is set up, but we still haven't implemented any actual feature work, and we still need to understand "FDispatchedClearInvocationSchedulerSelector". So let's go to "source/Clear/" and see what we have there. We only see two files, Clear.h and Clear.cpp, which are pretty short and contain everything we need to implement the actual feature.
The "Clear.h" file is 34 lines short and all we need to look at is this:
"source/Clear/Clear.h"
Lower-level implementations rely on recurring macros to avoid redundant work. While macros are often frowned upon for disrupting the reading flow, they work here as a kind of language extension, generating complex structures automatically during preprocessing; you just need to be familiar with their interface. First, we have three "ULIS_DECLARE_COMMAND_SCHEDULER" statements, declaring three low-level entries for ScheduleClear. The "mScheduleClear" member in FContextualDispatchTable will point to one of these versions ( AVX, SSE or MEM ). They are called immediately when building an FCommand object, passed as an fpCommandScheduler function pointer, and their role is to build a set of jobs according to the policy and the chunk constraints of the invocations.
Now we can finally have a look at the Dispatcher mechanism, and at "FDispatchedClearInvocationSchedulerSelector". With a single macro, "ULIS_DEFINE_DISPATCHER_GENERIC_GROUP", we define a SchedulerSelector structure that TDispatcher can read and query information from. After defining its name, we pass in the command schedulers we declared earlier, in order: the AVX, SSE, and MEM versions. This allows us to implement various levels of optimization for a given feature, and the dispatcher will select the most appropriate one based on the hardware metrics at runtime. Alternatively, when building on a platform that doesn't support SIMD, these optimization branches are discarded and the dispatcher defaults to the MEM version. MEM indicates that no SIMD optimizations are used.
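As a rough illustration of what "building jobs according to the policy and the chunk constraints" can mean, the sketch below splits a range of rows into jobs. It is only a sketch under assumptions: ChunkPolicy, JobArgs and BuildJobs are hypothetical names, and the two policies shown (one job for everything, or fixed-size groups of scanlines) are simplified stand-ins rather than the actual ULIS scheduling policies.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical chunking policies.
enum class ChunkPolicy {
    Mono,      // one job covers the whole region
    Scanlines  // one job per group of rows
};

// Hypothetical per-job arguments: which rows this job must process.
struct JobArgs {
    int firstRow;
    int rowCount;
};

// Splits 'height' rows starting at row 'y' into jobs.
// 'rowsPerJob' is only used by the Scanlines policy.
std::vector<JobArgs> BuildJobs(int y, int height, ChunkPolicy policy,
                               int rowsPerJob = 1) {
    std::vector<JobArgs> jobs;
    if (height <= 0) return jobs;  // nothing to schedule

    if (policy == ChunkPolicy::Mono) {
        jobs.push_back(JobArgs{y, height});
        return jobs;
    }

    assert(rowsPerJob > 0);
    for (int row = 0; row < height; row += rowsPerJob) {
        // The last job may be shorter than the others.
        const int count = std::min(rowsPerJob, height - row);
        jobs.push_back(JobArgs{y + row, count});
    }
    return jobs;
}
```

Each resulting JobArgs could then be handed to a worker thread independently, which is what makes the chunking policy a lever for multithreading.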
Now for the actual bulk of the work, let's switch to Clear.cpp. Let's first have a look at the end of the file:
"source/Clear/Clear.cpp"
This is symmetrical to the header file: we provide automatic implementations for the various ScheduleClear methods and indicate which kinds of arguments are expected, here FSimpleBufferCommandArgs as we've seen earlier, as well as FSimpleBufferJobArgs, which will be expected in the invocations. Finally, we indicate which invocations these will actually schedule within jobs. We then have the option to specify format specializations for the dispatcher and selector ( for an example with specializations for RGBA8, see Blend ), but here we don't need any.
Now for the most interesting piece where the work actually occurs, let's have a look at the Invocations:
"source/Clear/Clear.cpp"
Here we can do actual work on the image buffer, which we retrieve from the jargs and cargs inputs. jargs are Job Arguments, which hold arguments specific to a job, such as which chunk of the buffer to process. cargs are Command Arguments, which contain information common to all jobs; they are unused here due to the simplicity of the command, but can be useful when jobs share common arguments to read from.
Finally, the implementation consists of approximately 80 lines of actual code, and that's all you need for a Clear feature that works on every block, for all formats, all color models, all type depths, and any number of channels, working asynchronously and in a multithreaded environment, with a simple API. Here is an example of use:
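The split between job arguments and command arguments can be illustrated with a small standalone sketch. The names JobArgs, CommandArgs, InvokeClear and ClearRect are hypothetical, not the ULIS types, and the memory-only code path shown here simply zeroes bytes with std::memset; it is meant to show the shape of an invocation, not the actual one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical per-job arguments: the chunk of memory this job owns.
struct JobArgs {
    std::uint8_t* dst;
    std::size_t size;  // number of bytes to clear
};

// Hypothetical per-command arguments shared by all jobs.
// Empty here: a clear needs nothing beyond the chunk itself.
struct CommandArgs {};

// Invocation: the function each job runs on its own chunk.
void InvokeClear(const JobArgs* jargs, const CommandArgs* /*cargs*/) {
    std::memset(jargs->dst, 0, jargs->size);
}

// Sketch of a caller: one job per row of the rectangle, run serially.
// 'stride' is the number of bytes per scanline of the whole buffer.
void ClearRect(std::uint8_t* bits, std::size_t stride,
               std::size_t bytesPerPixel,
               std::size_t x, std::size_t y,
               std::size_t w, std::size_t h) {
    const CommandArgs cargs;
    for (std::size_t row = 0; row < h; ++row) {
        JobArgs jargs{bits + (y + row) * stride + x * bytesPerPixel,
                      w * bytesPerPixel};
        InvokeClear(&jargs, &cargs);
    }
}
```

Because each job only touches its own chunk, the per-row calls in ClearRect could be distributed across threads without synchronization between them.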