This is the specification for the lexical and syntactic constructs of the components of CDCL.
CDCL has what is known as an "authoring form". This authoring form is what people use to write and read disclosure control policy. People use CDCL authoring form to create rulesheets that declare and express policy rules.
This lexical discussion covers the authoring form's mechanisms by which rulesheets are organized and constructed so that policy may be uniformly interpreted. In contrast, the page on Booliette discusses a formalized language for non-technical personnel to express Boolean algebra in a way that is easier to understand than traditional means of expression. Booliette is intended to be an approximation of a natural language so that ease of use, clarity, and precision share equal importance.
Booliette is the most important member of the set of mechanisms CDCL authoring form employs in its grammar. So, suffice it to say that CDCL contains Booliette, while the converse is not true. Booliette along with the remainder of the non-trivial mechanisms are formally described in the grammar specification.
Policy authors will use Booliette within CDCL authoring form to declare the rules' stipulations, from the simple to the complex as appropriate, which must be observed and honored in order to enforce proper outcomes during data disclosure events. It's possible that one eventually may find uses for Booliette outside the domain of CDCL authoring form.
Some readers may be experts in related fields and will likely have expectations about CDCL based on prior experience in those fields. Therefore it may help to have an understanding of CDCL first in order to examine rulesheets. One may choose to review the fast-track CDCL introduction and save reading this site's content for a later time. Either way, several rulesheet examples are offered below.
For this discussion of CDCL Authoring Form, it may be helpful to become familiar with some example rulesheets.
It is critical for authors to understand that the policies they write will be interpreted in a CDCL Gatepoint from a specific perspective. That is one in which the document under redaction is examined node by node (i.e. element by element) and, from the point of evaluation of each node, the policies are examined to determine applicable rules and their outcomes. This is significantly different from the unsupported perspective of beginning with the policies' rules one-by-one and thereafter poring over the document to achieve policy compliance for each given rule. Each perspective lends itself to a different style of drafting rules. Although the proposed vocabulary of CDCL-purposed keywords in Booliette could be used for either style, some words would be used far more heavily in one style than the other. Since some of these words represent searches through the document, a style that relies heavily on words that cause these searches would be redundant at best and could negatively impact performance at worst. The unsupported perspective's style would do just that. Therefore, with an author's understanding of the supported perspective, it ought to be clear that words for searching (like "present-document contains") should be used sparingly in favor of reliance on the word "present-item" (synonymous with "content").
In other words, an author might be tempted to define a rule's Nodeset specification (the for-content block) without citing "present-item"/"any-item" or "content"/"all-content". That intuition is dangerous, because such a rule would be redundantly evaluated against all of the document's nodes on every node-by-node visit the Gatepoint makes anyway. Of course, this could be legitimate at times, but the likely rate of mistakes is far too high to ignore.
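The cost of the node-by-node perspective can be illustrated with a small Python sketch. It is not Gatepoint code: the rule dictionaries, the `document_contains` helper, and the sample document are all hypothetical stand-ins, and real rules would be written in Booliette. Rule A tests only the present node, while rule B ignores the present node and searches the whole document, so its search is repeated once per visited node.

```python
import xml.etree.ElementTree as ET

search_count = 0  # counts how often the document-wide search runs

def document_contains(root, tag):
    """Hypothetical stand-in for a searching phrase such as "present-document contains"."""
    global search_count
    search_count += 1
    return root.find(f".//{tag}") is not None

# Hypothetical rules: each for_content test receives the present node and the document root.
RULES = [
    {"id": "A", "for_content": lambda node, root: node.tag == "ssn"},
    {"id": "B", "for_content": lambda node, root: document_contains(root, "ssn")},
]

def evaluate_node_by_node(xml_text, rules):
    """Visit every element once and test every rule from that node's point of view."""
    root = ET.fromstring(xml_text)
    matches = []
    for node in root.iter():  # document order, one visit per element
        for rule in rules:
            if rule["for_content"](node, root):
                matches.append((node.tag, rule["id"]))
    return matches

doc = "<person><name>Pat</name><ssn>000-00-0000</ssn></person>"
matches = evaluate_node_by_node(doc, RULES)
print(matches)       # rule A matches one node; rule B matches all three
print(search_count)  # 3: the same search was repeated for every node
```

In this toy run rule B reports itself applicable to every node and repeats an identical search three times, which is the redundancy the text warns about.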
Furthermore, "present-item" and its synonyms (e.g. "content") must not exist anywhere outside a rule's Nodeset specification (the for-content block). For the purpose of clarity, which is an important facet of the entire CDCL solution, we have decided that a rule should have both a Nodeset specification and a separate condition/user specification (i.e. the Condition-set). If the "present-item" subject were allowed to exist in propositions anywhere within a rule, particularly within a Condition-set specification (the for-conditions block), there would be no sense in having these two separate areas of specification (i.e. the two kinds of blocks). We feel that erasing the distinction between these two blocks, by allowing "present-item" anywhere within the rule, would harm rule clarity.
Just as important as clarity, maintaining both the separation of these two specifications (Nodeset and Condition-set) and the prohibition on citing "present-item" outside the Nodeset specification allows CDCL to log a serialized representation of the combined and evaluated document/policy that exhibits rule applicability. For auditing purposes, this will be a great benefit by relieving auditors from the necessity to manually evaluate the policy against the document. And should an auditor choose to conduct manual evaluation, this affords an opportunity to verify that a Gatepoint is operating correctly.
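A syntax checker could enforce the prohibition on "present-item" outside the Nodeset specification with a very small test. The Python sketch below is only an illustration: the dictionary layout for a rule and the function name are hypothetical, and the text confirms only "present-item" and "content" as prohibited words. Treating "any-item" and "all-content" the same way is an assumption.

```python
# Words that denote the present node. "present-item" and "content" come from the
# text; including "any-item" and "all-content" is an assumption of this sketch.
PRESENT_ITEM_WORDS = {"present-item", "any-item", "content", "all-content"}

def misplaced_present_item(rule):
    """Return the for-conditions statements that cite a present-item word."""
    offenders = []
    for statement in rule.get("for_conditions", []):
        tokens = statement.replace("*", " ").split()  # whole-word comparison only
        if PRESENT_ITEM_WORDS.intersection(tokens):
            offenders.append(statement)
    return offenders

good_rule = {
    "id": "7",
    "for_content": ["content has-semantic PII"],
    "for_conditions": ["inherent-role-list has-semantic auditor"],
}
bad_rule = {
    "id": "9",  # hypothetical rule, not taken from the specification
    "for_content": ["content has-semantic PII"],
    "for_conditions": ["present-item has-semantic PII"],
}

good_offenders = misplaced_present_item(good_rule)
bad_offenders = misplaced_present_item(bad_rule)
print(good_offenders)
print(bad_offenders)
```

Because the comparison is made on whole tokens, a block label such as for-content is never mistaken for the word "content".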
CDCL authoring form offers its own aliasing feature that is different from Booliette's aliasing mechanism, but both mechanisms behave the same way. (In brief, it's a purely lexical substitution that occurs, logically speaking, before any processing takes place.)
Aliasing permits the substitution of simple character sequences for unwieldy, visually dense textual constructs such as URIs, XML literals, SQL queries, etc. Aliasing also gives the rulesheet author the means by which policy may be expressed in a localized dialect suitable for that author and that policy's stakeholder.
The scope of a CDCL alias declaration is the Rulesheet. CDCL alias resolution occurs
before any Booliette alias resolution occurs. The CDCL authoring form alias feature has
the following convention:
The word alias appears on its own line. On the following line, indented one level,
the word replace: appears on its own line. On the following line, indented
two levels, is the character sequence to be removed, on its own line; this is typically the shortened
form found throughout the rulesheet. On the following line,
indented back at one level (the same as for replace:), the word with:
appears on its own line. On the following line, indented two levels, is the character sequence
to be used instead of the removed sequence; this is typically the longer, unwieldy form.
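The convention above, together with the statement that aliasing is a purely lexical substitution, can be sketched in Python as follows. This is an illustration under assumptions rather than the CDCL implementation: the function name, the four-space indentation, and the example.org URI are hypothetical, and a plain `str.replace` is used, which would also rewrite the short form where it occurs inside a longer word.

```python
def resolve_aliases(rulesheet_text):
    """Remove alias blocks and apply them to the rest of the rulesheet as text substitutions."""
    lines = rulesheet_text.splitlines()
    aliases = []  # (short form, long form) pairs in declaration order
    body = []
    i = 0
    while i < len(lines):
        block = [ln.strip() for ln in lines[i:i + 5]]
        is_alias = (
            len(block) == 5
            and block[0] == "alias"
            and block[1] == "replace:"
            and block[3] == "with:"
        )
        if is_alias:
            aliases.append((block[2], block[4]))
            i += 5  # skip the five lines of the declaration
        else:
            body.append(lines[i])
            i += 1
    text = "\n".join(body)
    for short_form, long_form in aliases:
        text = text.replace(short_form, long_form)  # lexical substitution only
    return text

# Hypothetical rulesheet fragment; the URI is a placeholder, not a real WIJIS URI.
sheet = """alias
    replace:
        PII
    with:
        http://example.org/semantics#pii
rule
    id:
        7
    for-content:
        * content has-semantic PII"""

resolved = resolve_aliases(sheet)
print(resolved)
```

Since the substitution happens before any other processing, the text handed on to Booliette alias resolution already contains the long forms.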
Although the authoring form's alias feature can give a rulesheet author a tool for expressing identical policy logic using different words, use of alias is restricted to within the rulesheet in which it appears. Often, globally available synonyms for authoring form keywords will be useful, and for this purpose, there is the CDCL thesaurus.
In order to achieve support for synonyms of authoring form organizational constructs, such as when translating the constructs to Spanish, French, or any language (e.g. keyword "rule" in English to "regla" in Spanish), CDCL relies on the thesaurus of keywords. It may be too high a security risk to use an approach that would dynamically determine the thesaurus through a service. So, under consideration is the merit and cost of bundling a thesaurus with CDCL deployments.
A thesaurus is a special-purpose variant of a Semantic Registry. A CDCL implementation may be configured to know of a thesaurus (zero to many of them). For more information about CDCL implementations of Gatepoints, Rulesheet Repositories, Rulesheet Editing, and Syntax Checking, see the sections on decision making process flow and collation.
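If a thesaurus were bundled with a deployment, keyword synonyms could be normalized before a rulesheet is parsed. The Python sketch below assumes a thesaurus is a simple mapping from localized keyword to canonical keyword. That representation and the function name are hypothetical; only the "regla" to "rule" pair comes from the text.

```python
# Hypothetical bundled thesaurus: localized keyword -> canonical keyword.
# Only the "regla" -> "rule" pair is taken from the specification text.
THESAURUS = {"regla": "rule"}

def canonicalize_keywords(rulesheet_text, thesaurus):
    """Replace lines consisting of a single known synonym with the canonical keyword."""
    out = []
    for line in rulesheet_text.splitlines():
        indent = line[: len(line) - len(line.lstrip())]
        word = line.strip()
        # A keyword may stand alone ("rule") or carry a trailing colon ("id:").
        if word.endswith(":"):
            stem, colon = word[:-1], ":"
        else:
            stem, colon = word, ""
        out.append(indent + thesaurus.get(stem, stem) + colon)
    return "\n".join(out)

spanish_sheet = "regla\n    id:\n        7"
canonical = canonicalize_keywords(spanish_sheet, THESAURUS)
print(canonical)
```

Lines that are not a lone keyword, such as Booliette statements, pass through unchanged because they have no entry in the mapping.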
CDCL recognizes many information entities whose values are determined at runtime. These entities may be found within four different explicit contexts:
Although the aliases' values are not known until runtime (and in fact may wind up being UNDEFINED at runtime), these items may be referenced in CDCL Rules. Accordingly, CDCL reserves the aforementioned alias names for the respective runtime entities. These aliases are undeclared, because the declaration of an alias presupposes an exact knowledge of the value being aliased. These special aliases will assume values supplied to the CDCL Gatepoint processor at runtime.
CDCL is built of several discrete components, sequenced or nested as the case may be:
refuse-context-containing-delegation
refuse-context-containing-multiple-users
* all-true
* current-time has-semantic business-hours
* outcome-order-of-priority has-value 'deny-overrides-redact-overrides-hold-overrides-disclose'
* business-purpose has-semantic federal-emergency-management-training
Validation of rulesheets involves checks for compliance of both syntax and logical coherence.
There is no guarantee of the order in which rules are evaluated.
The order of precedence of rules is not the same thing as the order in which rules are evaluated. The order of rule precedence is a future feature of CDCL rule authoring form, of CDCL rule fundamental form, and of outcome decision making. This feature is intended to satisfy Condition Scope Control.
In authoring form, one may dictate the precedence of one to many rules over another rule by using the unless keyword. For example:
rule
    id:
        7
    apply-outcomes:
        disclose
    for-content:
        * content has-semantic PII
    for-conditions:
        * all-true
        * inherent-role-list has-semantic auditor
        * current-time is-greater-scalar-value-than "15:00:00"
rule
    id:
        8
    unless:
        3 or 4 or 7 applies
    apply-outcomes:
        disclose-and-hold-for-review
            immediate-reply-to-recipient:
                The document is being held for manual review prior to a disclosure decision.
                Please contact my.reviewer@thefed.gov for review results.
            address-list:
                my.reviewer@thefed.gov
            subject:
                Gatepoint Hold for Review
            body:
                See attachment. Sincerely, your friendly neighborhood Gatepoint.
    for-content:
        * content has-semantic PII
    for-conditions:
        * inherent-role-list has-semantic auditor
In outcome decision making, evaluation of a specific rule would be skipped entirely
if the rule were found to occupy any position other than the first in a precedence
sequence, unless the rule were under immediate consideration within a precedence
resolution operation. That is the case when the rule in first position is found
inapplicable, whereupon the rule or rules in secondary position are considered.
For example, suppose the very first rule to be evaluated were rule 8 above. It would be
skipped because it has lower precedence than rules 3, 4, and 7. Some time later, rule
7 is evaluated and determined to be inapplicable; likewise rules 4 and 3. Following the
sequence of precedence to rule 8, rule 8 is no longer skipped and is
evaluated for applicability. But this had to wait until rules 3, 4, and 7 were found to be
inapplicable. (Note that in following the sequence of precedence, there may be more rules
than just 8 to seek out and evaluate.)
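One way to picture the precedence resolution described above is the following Python sketch. It is an assumption-laden illustration, not the outcome decision-making algorithm itself: the dictionary layout, the `applicable` callables, and the function name are hypothetical, and cycles among unless clauses are not handled. A rule is treated as applying only when none of the rules named in its unless clause applies and its own test passes, so the result does not depend on the order in which rules are visited.

```python
def applies(rule_id, rules, cache=None):
    """Decide whether a rule applies, resolving the rules it yields to first."""
    if cache is None:
        cache = {}
    if rule_id in cache:
        return cache[rule_id]
    rule = rules[rule_id]
    # The rule is skipped if any higher-precedence rule in its unless list applies.
    if any(applies(other, rules, cache) for other in rule.get("unless", [])):
        cache[rule_id] = False
    else:
        cache[rule_id] = rule["applicable"]()
    return cache[rule_id]

# Hypothetical applicability results for the rules named in the example.
rules = {
    "3": {"applicable": lambda: False},
    "4": {"applicable": lambda: False},
    "7": {"applicable": lambda: False},
    "8": {"applicable": lambda: True, "unless": ["3", "4", "7"]},
}

all_inapplicable = applies("8", rules)  # rules 3, 4 and 7 do not apply
rules["7"]["applicable"] = lambda: True
seven_applies = applies("8", rules)     # rule 7 now applies, so rule 8 is skipped
print(all_inapplicable, seven_applies)
```

The cache ensures each rule's own test runs at most once per resolution, even when several rules name the same higher-precedence rule.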
The logical model described above would be written in CDCL Authoring Form as follows. This listing is annotated below with a more formal treatment of the language elements. CDCL keywords are highlighted like this; if they are links, clicking on them shows definition or discussion.
The characters : { } = are literals, required to be in the code
as shown. Remarks in parentheses are descriptions or discussions; neither the parentheses nor their content
form a part of CDCL syntax.
Booliette syntax is discussed on its own page. None of the syntactic requirements of CDCL are applicable to the Booliette elements; they follow their own rules.
include
    doctype:
        (alias or full URI)
    primary-custodian:
        (alias or full URI)
    revision:
        (fragment identifier)
    under-these-conditions:
        (booliette statements about stake)
alias
    replace:
        (fragment identifier)
    with:
        (fragment identifier)
default-rule
    id:
        (fragment identifier)
    apply-outcomes:
        (outcome alias or full URI)
rule
    id:
        (fragment identifier)
    apply-outcomes:
        (outcome alias or full URI)
    for-content:
        (booliette statements about present node or present document)
    for-conditions:
        (booliette statements about recipient user; "present-item" alias not permitted)
rule
    id:
        (fragment identifier)
    apply-outcomes:
        (outcome alias or full URI)
    for-content:
        (booliette statements about present node or present document)
    for-conditions:
        (booliette statements about recipient user; "present-item" alias not permitted)
(...and so on. the number of rules in a rulesheet is unrestricted.)