Application Data Manager (ADM) is an open-source solution for processing large amounts of data in real time. In this segment, I describe Sequences, the ADM process for automated data recycling.

# Set Theory

ADM is rooted in set theory.

A set is a number of things of the same kind that belong or are used together.

- The rows of a table or view are a set.
- The columns of a table or view are a set.
- The tables, views, reports and scripts of a database are sets.
- The databases of the application data model (ADM) are a set.

## Enumerating Members of a Set

To enumerate a set is to specify one element after another.

One way to do this is to rearrange the order of the elements of the set.

To minimize overhead we will NOT use this approach.

Instead, let us create a set S of non-negative integer aliases, one for each element of a generic set M of many elements, and require that the elements of M and S be in one-to-one correspondence, as defined below.

### Definitions

- Enumerand: an enumerated element of set M.
- Alias: a non-negative integer member of set S in an immutable, one-to-one correspondence with an enumerand.
- Sequence: a one-to-one mapping of S onto itself in some order.

To change the enumerand access order, we alter the sequence and do nothing to the set M itself.
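As a minimal sketch of these definitions, the set M can be held in a list whose indices serve as aliases, while a separate list of aliases plays the role of a sequence. The names `M`, `order_of_appearance`, `alphabetical`, and `enumerate_in_order` are illustrative assumptions and are not taken from ADM.

```python
# Hypothetical illustration; none of these names come from ADM itself.
M = ["pear", "apple", "fig", "cherry"]  # the set M, stored as it arrived

# Aliases are the indices 0..len(M)-1. Listing them in ascending order
# gives the identity sequence.
order_of_appearance = list(range(len(M)))

# A second sequence over the same aliases: alphabetical access order.
alphabetical = sorted(order_of_appearance, key=lambda alias: M[alias])


def enumerate_in_order(members, sequence):
    """Return the enumerands in the order given by a sequence of aliases."""
    return [members[alias] for alias in sequence]
```

Building `alphabetical` reorders only the aliases; the list `M` is never rearranged, which is the point of the approach.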

For every set M there is at least one sequence called the *order of appearance*.

The primary use of the *order of appearance* is *element recycling*.

## Recycling Process

The following table illustrates the recycling process.

The column *step* enumerates the set of recycling process actions.

The column *sequence* illustrates a newly allocated sequence for a set of up to eight elements. Note that the numeric aliases have been initialized with the values 0 to 7. They are shown in green because they have never been used.

The column *capacity* is the maximum number of elements that can be accessed without additional memory allocation.

The column *hi-water* is the maximum number of elements ever activated.

The column *unused* is the difference between *capacity* and *hi-water*. It represents the number of elements that have never been used and is shown in green.

The column *length* is the number of active elements and is shown in yellow.

The column *recycled* is the difference between hi-water and length and is shown in red.

The column *available* is the number of elements available for set expansion and is the difference between *capacity* and *length*.
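The derived columns follow directly from the three stored values *capacity*, *hi-water*, and *length*. The sketch below restates those definitions as arithmetic; the class name `SequenceCounters` and its field names are assumptions made for illustration, not ADM identifiers.

```python
from dataclasses import dataclass


@dataclass
class SequenceCounters:
    """Hypothetical holder for the three stored counters of a sequence."""

    capacity: int       # elements accessible without further allocation
    hi_water: int = 0   # maximum number of elements ever activated
    length: int = 0     # number of currently active elements

    @property
    def unused(self) -> int:
        # Never-used elements (green).
        return self.capacity - self.hi_water

    @property
    def recycled(self) -> int:
        # Elements used before and now inactive (red).
        return self.hi_water - self.length

    @property
    def available(self) -> int:
        # Room for expansion; always equals unused + recycled.
        return self.capacity - self.length
```

With a capacity of 8, a hi-water mark of 6, and a length of 4, this gives 2 unused, 2 recycled, and 4 available elements.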

Step 2: to append an element, simply increment the length and use the alias. Yellow signals that the alias is active.

Step 3: after appending five more elements, six elements are in use (yellow) and there is availability for two more elements (green).

Step 4: element 3 is no longer needed and has been marked for recycling.

Step 5: a left cyclic permutation (x<-3<-4<-5<-x) and decrement in length inactivates and recycles element 3 (red). Note that the hi-water mark does not change as elements are recycled.

Step 6: element 1 is no longer needed and has been marked for recycling.

Step 7: a left cyclic permutation (x<-1<-2<-4<-5<-3<-x) and decrement in length inactivates and recycles element 1 (red).

Step 8: a new element is appended reactivating element 3. In set M the object aliased by element 3 is removed and replaced.

Step 9: a new element is appended reactivating element 1. In set M the object aliased by element 1 is removed and replaced.

Step 10: two new elements are appended and the sequence capacity is reached.
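The steps above can be replayed with a short sketch. The class `Sequence` and its methods `append` and `recycle` are hypothetical names, and this is not ADM's actual code. One detail is inferred rather than stated: the permutation in Step 7 includes the already recycled alias 3, so the rotation here is assumed to run from the recycled position up to the hi-water mark.

```python
class Sequence:
    """Hypothetical sketch of a recycling sequence; not ADM's actual code."""

    def __init__(self, capacity):
        self.aliases = list(range(capacity))  # initialized 0..capacity-1
        self.capacity = capacity
        self.hi_water = 0  # maximum number of elements ever activated
        self.length = 0    # number of currently active elements

    def append(self):
        """Activate the alias at position `length` and return it."""
        if self.length == self.capacity:
            raise OverflowError("sequence capacity reached")
        alias = self.aliases[self.length]
        self.length += 1
        self.hi_water = max(self.hi_water, self.length)
        return alias

    def recycle(self, alias):
        """Deactivate an active alias using a left cyclic permutation."""
        # Search only the active region; raises ValueError if not active.
        pos = self.aliases.index(alias, 0, self.length)
        # Rotate positions pos..hi_water-1 left by one, so the recycled
        # alias lands behind every previously recycled alias.
        self.aliases[pos:self.hi_water] = (
            self.aliases[pos + 1:self.hi_water] + [alias]
        )
        self.length -= 1


seq = Sequence(8)
first_six = [seq.append() for _ in range(6)]   # Steps 2 and 3
seq.recycle(3)                                 # Steps 4 and 5
after_step_5 = list(seq.aliases)
seq.recycle(1)                                 # Steps 6 and 7
after_step_7 = list(seq.aliases)
reused = [seq.append(), seq.append()]          # Steps 8 and 9
last_two = [seq.append(), seq.append()]        # Step 10
```

Under this assumption, recycled aliases are reused in the order they were recycled: alias 3 is reactivated before alias 1, matching Steps 8 and 9.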

In my next blog post, I will describe the unconstrained arrays that allow sequences to grow incrementally up to a maximum capacity of 16,777,216 elements.

Copyright © 2014 Color My Data, All Rights Reserved