* An LDB Mapping Module                 -*- outline -*-

** Scenario
Usage of an instance of the map module requires the full set of data
to be split in two "partitions", a local (fallback) and a remote
(mapped) one.  The remote partition will be used to store additional
data but may differ from the local partition in the schema it uses.
The goal is to store as much data as possible on the remote partition.

A user is required to define mappings between local and remote data
schemas to account for these differences, but also to enable usage of
the remote partition for as large a dataset as possible.

Data not explicitly accounted for in the defined mappings is ignored
by the remote partition and stored on the local partition instead.


** Assumptions

*** Mapped vs. Not Mapped
Each record exists in either of two states:

**** Not mapped:
The record and all of its data are stored on the local partition; the
additional attribute "isMapped" is not present in the record.

**** Mapped:
The record is split into a local and a remote part stored on the local
and remote partition respectively; an additional internal attribute
"isMapped" is added to the local part of the record.

The local part of a mapped record is allowed to consist only of the
"isMapped" attribute, denoting the case when all "real" data is mapped
to the remote partition.

*** Internal attribute "isMapped"
The local part of each mapped record includes an internal attribute
"isMapped", the value of which is irrelevant.  It is not present in
the local part of a record that is not mapped.

*** Local != Fallback
Each attribute must either be mapped or ignored; the local part should
be reserved for ignored data and not serve as fallback storage for
failed mappings.

Failure to map an attribute indicates a problem with the specified
mappings and should be reported.
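
A minimal sketch of the split described by these assumptions, in Python
for brevity (the module itself is not written in Python).  The names
`split_record' and `MappingError', the shape of the mapping table and
the sample attribute names are hypothetical and not taken from ldb_map.

```python
IS_MAPPED = "isMapped"


class MappingError(Exception):
    """Raised when a defined mapping cannot be applied to a value."""


def split_record(record, mappings):
    """Split a record into (local_part, remote_part).

    record:   attribute name -> list of values
    mappings: local attribute name -> (remote name, conversion function)
              (hypothetical shape, chosen for this sketch only)
    """
    local, remote = {}, {}
    for name, values in record.items():
        mapping = mappings.get(name)
        if mapping is None:
            # Not accounted for in the mappings: ignored, stays local.
            local[name] = values
            continue
        remote_name, convert = mapping
        try:
            remote[remote_name] = [convert(v) for v in values]
        except ValueError as exc:
            # Local storage is not a fallback: report the failure.
            raise MappingError("cannot map %r" % name) from exc
    if remote:
        # Only the presence of isMapped matters, not its value.
        local[IS_MAPPED] = ["TRUE"]
    return local, remote
```

A record without mapped attributes comes back unchanged and without
"isMapped"; a record whose attributes are all mapped leaves a local
part consisting of "isMapped" only.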

*** No smart mapping
The map module has no knowledge of constraints present in the backend
of the remote partition, such as objectClass restrictions.

Constraint violation results from the remote backend indicate a
problem with the specified mappings and should be reported.

*** Uniqueness of local names
The specified mappings should be uniquely indexable by the local name
of a handled attribute.

Generation of multiple remote attributes from a single local attribute
will be possible but should be defined in a single mapping to ease
conversions from the remote to the local format (as for search
responses).

(This might be problematic for attributes that form a record's RDN, as
only one attribute can be used for that at a time...)
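
One way to picture a table indexed by local name, sketched in Python.
The table layout, the attribute names and the helper
`remote_to_local' are assumptions made for illustration; the check
that a remote name is generated only once is also an assumption, added
because it keeps the reverse direction unambiguous.

```python
# Hypothetical mapping table keyed by local attribute name.  Each entry
# lists every remote attribute generated from that local attribute.
MAPPINGS = {
    "displayName": ["cn", "gecos"],
    "description": ["description"],
}


def remote_to_local(mappings):
    """Build the reverse index used when converting search responses."""
    reverse = {}
    for local_name, remote_names in mappings.items():
        for remote_name in remote_names:
            if remote_name in reverse:
                raise ValueError(
                    "remote attribute %r is generated twice" % remote_name)
            reverse[remote_name] = local_name
    return reverse
```

Keeping all remote attributes of one local attribute in a single entry
means the reverse index can be derived mechanically.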

*** Stable naming contexts
Mappings must not tamper with the naming context of a record.  The
naming context is (part of) what specifies the destination partition
for a dataset; its modification is defined globally in a special record
of the map instance.


** Alternatives to local "isMapped"

*** Local "isMappedTo"
Instead of setting a local attribute to just denote whether a record
is mapped or not, the attribute could contain the DN of the remote
part of the record's data.  When looking for the attribute, its
contents could be extracted and instantly reused to access remote
data.

*** Remote "isMappedFrom"
Instead of the local part pointing to the remote part, the latter
could be assigned an attribute containing the DN of the local part of
the data.  Finding the remote part of a record with a local DN would
then consist of searching for a record that claims to be mapped from
that local DN.
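
A sketch of that lookup in Python, with the remote partition modelled
as a plain dictionary from DN to attributes.  The function name
`find_remote_part' and the in-memory model are assumptions; a real
implementation would issue a search request instead of a loop.

```python
def find_remote_part(remote_partition, local_dn):
    """Return the DN of the remote record mapped from local_dn, or None."""
    for dn, attrs in remote_partition.items():
        if local_dn in attrs.get("isMappedFrom", []):
            return dn
    return None
```

Compared to "isMappedTo", this costs a search on every access to the
remote part, but leaves the local record untouched.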

*** More GUIDs
Use GUIDs instead of DNs to identify related records.

*** Another "masterGUID=%" RDN
The idea from local_password could be reused.  This would also save us
from mapping DNs back and forth all over the place.


** Problems

*** with these alternatives to "isMapped"
The obvious problem is the use of a local/remote schema which might
prevent our use of arbitrary internal attributes for mapped data.
Another is mapping the directory structure in the remote backend,
which the third alternative would prevent.
As far as I understand, we would also need to be in samdb to be able
to use GUIDs.

*** with partitions
When I use partitions (or really just different baseDNs) to separate
the local and remote records, I *should* make sure to only modify DNs
under the local baseDN.  What should one do in the case of a rename
request where one of the old and new DN is under the local baseDN but
the other isn't?

*** with object existence
The premises include that objects are always created locally, even if
only to hold an "isMapped" attribute, but that they don't need to exist
remotely.  The structure of the remote partition exactly mirrors that
of the local partition though, i.e. an object has the same DN relative
to the local as to the remote baseDN (except for mappings of the DN
components, but those don't matter here).  This implies that *all*
containers along this path must exist both locally *and* remotely,
regardless of their attribute components.

This means it *could* be possible to add a container (which isn't
mapped for some reason) and later to create a record below that
container that contains mapped data.  The request will fail due to the
missing container, which, from the user's point of view, *does* exist.
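
The failure can be reproduced with a toy model in Python.  Both
partitions are dictionaries keyed by the same DN for brevity, and
`add_record' is a hypothetical stand-in for a backend that refuses to
add a record below a missing parent.

```python
def add_record(partition, dn, attrs):
    """Add a record, refusing if its parent does not exist."""
    parent = dn.split(",", 1)[1]  # naive: ignores escaped commas
    if parent not in partition:
        raise LookupError("parent %r does not exist" % parent)
    partition[dn] = attrs


local = {"dc=base": {}}
remote = {"dc=base": {}}

# An unmapped container is created on the local partition only.
add_record(local, "ou=people,dc=base", {"ou": ["people"]})

# A child with mapped data needs a remote part, but the remote
# partition has never seen the container.
add_record(local, "cn=alice,ou=people,dc=base", {"isMapped": ["TRUE"]})
try:
    add_record(remote, "cn=alice,ou=people,dc=base", {"cn": ["alice"]})
    failed = False
except LookupError:
    failed = True
```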


** Mappings

*** Abstractly
A mapping represents, viewed abstractly, a function of type
Request x Context --> Request

The structure of LDB modules performs dispatching based on request
type itself, so we can substitute the request contents for Request in
that type.  Disregarding delete and rename requests, which contain
only one or two target DNs, the contents of a request are of type LDB
message, so we assume a type
Message x Context --> Message

The ldb_map module disregards context completely, modelling only
functions 
Message --> Message

and a few restricted subforms of types
Message Element --> Message Element

or even just
String --> String

This is convenient for most cases and increases expressiveness of the
"mapping definition language" but limits the overall power of
mappings, as e.g. the password_hash module cannot currently be
expressed with the ldb_map module due to lacking context (the domain
data search result and the array `attrs' of the
`password_hash_mod_search_self' function).

The only way to access that data is by making (synchronous) LDB calls
from within the attribute generation function.
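
The three context-free forms can be written down as Python type
aliases, together with the obvious way of lifting a simpler form into
a more general one.  This is an illustration only: a message element
is reduced to its list of values here, and `lift_string' and
`lift_element' are hypothetical helpers, not functions of ldb_map.

```python
from typing import Callable, Dict, List

Message = Dict[str, List[str]]            # attribute name -> values
MessageMap = Callable[[Message], Message]
ElementMap = Callable[[List[str]], List[str]]
StringMap = Callable[[str], str]


def lift_string(f: StringMap) -> ElementMap:
    """Apply a String --> String mapping to every value of an element."""
    return lambda values: [f(v) for v in values]


def lift_element(local_name: str, remote_name: str,
                 f: ElementMap) -> MessageMap:
    """Turn an element mapping into a Message --> Message mapping."""
    def convert(msg: Message) -> Message:
        if local_name not in msg:
            return {}
        return {remote_name: f(msg[local_name])}
    return convert
```

None of these signatures has room for a Context argument, which is
exactly the limitation described above.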

*** As Balls of Mud
The canonical way to fetch context is to prefix the "real" request
with a search request that results in the required data and is then
stuffed in the request's async context.

This pattern could of course be factored into the map module; mappings
requiring this data would then need ways to request it, either as an
array of strings (for the simpler case where additional info about the
record itself is required) or as an array of (search) requests (for
the general case, allowing the whole database to be queried).
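
A sketch of the simpler case in Python, where a mapping asks for
additional attributes of the record itself as an array of strings.
The database is a dictionary, everything runs synchronously, and the
names `search_attrs' and `run_mapping' are assumptions made for this
example.

```python
def search_attrs(db, dn, attrs):
    """The prefixed search: fetch the requested attributes of one record."""
    record = db.get(dn, {})
    return {a: record[a] for a in attrs if a in record}


def run_mapping(db, dn, values, mapping):
    """Run one mapping with the context it asked for.

    mapping: (context_attrs, generate), where context_attrs is the
    array of attribute names the mapping needs and generate takes
    (values, context) and returns the generated values.
    """
    context_attrs, generate = mapping
    context = search_attrs(db, dn, context_attrs)
    return generate(values, context)
```

The general case would replace the array of attribute names with an
array of search requests and collect one result per request.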


** Requests

*** Delete
look for remote record
if found:
  delete remote record
delete local record

*** Rename
look for remote record
if found:
  rename remote record
rename local record
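
The Delete and Rename procedures can be sketched together in Python.
The two partitions are dictionaries, and the remote DN is derived by
swapping the baseDN; the baseDN values and the helper `to_remote_dn'
are hypothetical.  Refusing a rename that leaves the local baseDN is
one possible answer to the question raised under "Problems", not a
decision taken by the document.

```python
LOCAL_BASE = "dc=local"
REMOTE_BASE = "dc=remote"


def to_remote_dn(local_dn):
    """Swap the local baseDN for the remote one."""
    if local_dn != LOCAL_BASE and not local_dn.endswith("," + LOCAL_BASE):
        raise ValueError("%r is not under the local baseDN" % local_dn)
    return local_dn[: len(local_dn) - len(LOCAL_BASE)] + REMOTE_BASE


def delete(local, remote, dn):
    # Delete the remote record if there is one, then the local record.
    remote.pop(to_remote_dn(dn), None)
    del local[dn]


def rename(local, remote, old_dn, new_dn):
    # Resolve both DNs first so nothing changes if either is invalid.
    old_remote = to_remote_dn(old_dn)
    new_remote = to_remote_dn(new_dn)
    if old_remote in remote:
        remote[new_remote] = remote.pop(old_remote)
    local[new_dn] = local.pop(old_dn)
```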

*** Add
for each attribute:
  if it is local:
    request it locally
  if it doesn't require context:
    map it
    request it remotely
  otherwise:
    register requested query
if query registered:
  get context from query result
if remote record requested:
  add remote record with context
  add local record with "isMapped"
otherwise:
  add local record w/o "isMapped"

This doesn't make sense: we're trying to *add* the remote record, so
we can't possibly query it before adding it!
Our only choices are to either find out immediately which mappings
cannot be performed due to missing context data or to split the remote
message into an immediate and a postponed part, the immediate one
being run *right now* and the postponed, well, postponed until after
that; the postponed message would then have to be split up again and
again until either the immediate part is empty (i.e. some data cannot
be added) or the postponed part is empty (i.e. we're done and can go
home now).
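
The second choice, repeatedly splitting into an immediate and a
postponed part, can be sketched as a loop in Python.  The shape of a
mapping (remote name, list of remote attributes needed as context,
generator function) and the name `add_remote' are assumptions made for
this sketch; the context is simply the remote record built so far.

```python
def add_remote(attrs, mappings):
    """Build the remote record for an add, round by round.

    attrs:    local attribute name -> values
    mappings: local name -> (remote name, needed remote attrs, generate)
    """
    remote = {}
    pending = dict(attrs)
    while pending:
        # Immediate part: everything whose context is already available.
        immediate = {name: values for name, values in pending.items()
                     if set(mappings[name][1]) <= set(remote)}
        if not immediate:
            # Nothing can run right now: some data cannot be added.
            raise ValueError(
                "missing context for %s" % sorted(pending))
        for name, values in immediate.items():
            remote_name, _needs, generate = mappings[name]
            remote[remote_name] = generate(values, remote)
            del pending[name]
    return remote
```

Each round either shrinks the postponed part or aborts, so the loop
terminates after at most as many rounds as there are attributes.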

*** Modify
if no requested changes are remote:
  just run local request
look for remote record
if not found:
  turn remote request into "add"
  add remote record (with context)
  register relation between local and remote record
otherwise:
  modify remote record (with context)
  modify local record
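
A sketch of the Modify procedure in Python, under simplifying
assumptions: both partitions are dictionaries keyed by the same DN,
every change is a plain replace, context is left out, and the local
changes are applied in both branches.  The function name `modify' and
the mapping shape (local name to remote name) are hypothetical.

```python
def modify(local, remote, dn, changes, mappings):
    """Apply replace-style changes to a possibly mapped record."""
    local_changes = {n: v for n, v in changes.items() if n not in mappings}
    remote_changes = {mappings[n]: v for n, v in changes.items()
                      if n in mappings}
    if not remote_changes:
        # No requested change is remote: just run the local request.
        local[dn].update(local_changes)
        return
    if dn not in remote:
        # Turn the remote request into an add and register the relation.
        remote[dn] = dict(remote_changes)
        local_changes["isMapped"] = ["TRUE"]
    else:
        remote[dn].update(remote_changes)
    local[dn].update(local_changes)
```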

*** Search
if no requested attribute is remote or "*":
  just run local request
otherwise:
  run query with local attributes and "isMapped"
if "isMapped" is not set and remote attributes were requested:
  abort
otherwise:
  query remote DN with remote attributes
unmap remote result
merge local and remote result
remove "isMapped" from result
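
A sketch of this procedure for a single DN in Python.  Both partitions
are dictionaries keyed by the same DN, mappings are plain renames
(local name to remote name), and "abort" is read here as "skip the
remote lookup and return the local data"; all of these are
assumptions, as is the function name `search'.

```python
def search(local, remote, dn, attrs, mappings):
    """Return the requested attributes of one record, merged and unmapped."""
    everything = "*" in attrs
    record = local[dn]
    result = {n: v for n, v in record.items() if everything or n in attrs}
    wants_remote = everything or any(a in mappings for a in attrs)
    if wants_remote and "isMapped" in record:
        reverse = {r: l for l, r in mappings.items()}
        for remote_name, values in remote.get(dn, {}).items():
            local_name = reverse.get(remote_name)
            # Unmap the remote attribute and merge it into the result.
            if local_name is not None and (everything or local_name in attrs):
                result[local_name] = values
    # isMapped is internal and never shown to the caller.
    result.pop("isMapped", None)
    return result
```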


Hmm... Assume that both the local and remote parts of a split parse
tree are non-null.  Assume we are in the async callback of the local
search, after having found a local record matching the local parse
tree, and are preparing the search for the matching remote record.
That remote record must match the remote parse tree so the original
parse tree matches the merged record, so we should be able to search
for the matching remote DN *and* the remote parse tree.  Right?

So, we'll consider these cases:

- there is no parse tree:
  - just search as we did before

- the parse tree is local:
  - use the original parse tree for the local search
  - continue as before

- the parse tree is remote:
  - search locally as before
  - use the mapped parse tree for remote search
  - on match:
    - merge remote result into local result
  - otherwise:
    - skip local result

- parse tree can be split:
  - use the local parse tree for the local search
  - use the mapped remote parse tree for remote search
  - on match:
    - merge remote result into local result
  - otherwise:
    - skip local result
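
The four cases collapse into one code path if the parse tree is
restricted to a conjunction of equality tests, which is the only shape
this Python sketch handles.  The tree representation, the dictionary
partitions keyed by the same DN, and the names `split_tree',
`matches' and `search_tree' are all assumptions made for illustration.

```python
def split_tree(tree, mappings):
    """Split a list of (attr, value) tests into local and remote parts."""
    local_tree = [(a, v) for a, v in tree if a not in mappings]
    remote_tree = [(mappings[a], v) for a, v in tree if a in mappings]
    return local_tree, remote_tree


def matches(record, tree):
    """True if the record passes every test; an empty tree matches all."""
    return all(v in record.get(a, []) for a, v in tree)


def search_tree(local, remote, tree, mappings):
    local_tree, remote_tree = split_tree(tree, mappings)
    reverse = {r: l for l, r in mappings.items()}
    results = []
    for dn, local_rec in local.items():
        if not matches(local_rec, local_tree):
            continue
        remote_rec = remote.get(dn) if "isMapped" in local_rec else None
        if remote_tree and (remote_rec is None
                            or not matches(remote_rec, remote_tree)):
            continue  # remote part does not match: skip local result
        merged = {n: v for n, v in local_rec.items() if n != "isMapped"}
        for remote_name, values in (remote_rec or {}).items():
            if remote_name in reverse:
                merged[reverse[remote_name]] = values
        results.append((dn, merged))
    return results
```

An empty tree, a purely local tree and a purely remote tree are the
degenerate cases where one or both halves of the split are empty.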