3. Scalar models: the Cell classes


3.1 Introduction

3.1.1 Terminology

The English word cell comes from the Latin cella meaning "chamber", from which we also get our word cellar. These are distantly related to conceal via the Latin celare, "to hide", and color, "to cover". All are derived from the ancient Indo-European root [-kel], which means "to conceal, cover, hide". In biology, a cell is "the smallest structural unit of an organism capable of independent functioning." Incidentally, living cells exhibit the OO principles of encapsulation, inheritance, polymorphism, and message-based interfaces (but that is another essay for another time). (Also, the etymology of words is a great example of the inheritance principle, but this is yet another essay.) In spreadsheet applications, cells are boxes arranged in rows and columns, into which one can enter a single piece of data. A spreadsheet cell is analogous to a field in database management systems.

In this thought experiment, we define a Cell class, which is a scalar model, by which we mean that it wraps or encapsulates a value that cannot be generated from more elemental values. The Cell class at once hides the value and exposes it in a well-defined manner. It also conceals a great deal of additional functionality.

3.1.2 Incremental definition

3.1.2.1 Technique

The general technique of evolving a framework was introduced in a previous section. More specifically, the Cell class is evolved in four or five phases as described immediately below. Within each phase, one or two methods are defined and tested at a time. The definitions are displayed in boxes with a white background, while the unit tests are given a green background. For the most part these are presented in the order in which they were originally investigated. Further explanation is provided where deemed helpful.

3.1.2.2 Phases of the process

3.1.2.3 Initial import statements

The investigation is carried out as a suite of unit tests for classes defined in line with the unit tests. The test module begins, therefore, by importing the unittest module. We also import copy and cPickle, both of which are required in Phase Four.

import unittest, copy
import cPickle as pickle

Now we are ready to begin Phase One, defining the essential interfaces.

3.2 Four essential interfaces

  1. A cell is a variable
  2. A cell is observable
  3. A cell is a node
  4. A cell is constrained

3.2.1 A cell is a variable

3.2.1.1 Attribute assignment and access

Above all else, the Cell class defines a variable: each cell instance holds a value that can be changed through some operation. Thus, the simplest possible definition for a Cell class is:

class Cell_000:
    """Bare bones Cell class; nothing but an attribute."""
    def __init__(self):
        self.value = None

The operation to change the value is attribute assignment, and the operation to read the value is simply attribute access.

class Test000(unittest.TestCase):
    """The *variable* interface for Cell class"""

    def test000(self):
        """pre Cell: bare value"""
        x = Cell_000()
        self.assertEqual(x.value, None)
        x.value = 'testval1'  # attribute assignment
        val = x.value         # attribute access
        self.assertEqual(val, 'testval1')
        x.value = 123.45
        self.assertEqual(x.value, 123.45)

The first question is why bother with the attribute? Why not assign and access a value directly?

        x = 'testval1'        # value assignment
        val = x               # value access
        self.assertEqual(val, 'testval1')

The answer is that value assignment is too direct. The value is too exposed. We are looking for a way to conceal a value so that there is better control over assignment and access. Unfortunately, even attribute access is too direct. What we need is a way to access an attribute indirectly through a set of custom methods defined for that purpose.

3.2.1.2 The 'value' property

With Python 2.2 and later, it is possible to simulate attribute access with a feature called properties. A property is an attribute with accessor methods (get and set), which provide an additional layer of control. Property assignment goes through the "setter" method (designated with the fset parameter) and property access goes through the disignated "getter" method (fget parameter).

class Cell_001(object):
    """Minimal Cell class, start evolution from here."""
    def __init__(self):
        self._value = None

    def Get(self):
        return self._value

    def Set(self, value):
        self._value = value

    value = property(fget=Get, fset=Set)

To the programmer, the value property looks just like the bare 'value' attribute above, and the unit tests are exactly the same. (Note that the Cell class must be derived from object for properties to work.)

    # class Test000 (continued)
    def test001(self):
        """pre Cell: property value"""
        x = Cell_001()
        self.assertEqual(x.value, None)
        x.value = 'testval1'  # property assignment
        val = x.value         # property access
        self.assertEqual(val, 'testval1')
        x.value = 123.45
        self.assertEqual(x.value, 123.45)

    # test002 and test003 not published here (redundant)

Thus, the variable interface has evolved from:

3.2.1.3 Datatypes

By itself, however, the property definition above buys us little, if anything, since the defined accessor methods do precisely what direct attribute access did before. What it does give us, however, is a framework for revising the Set method so that (if necessary) the value can be changed to another representation before assigning it. It does so by calling another method, Convert, which may be overridden in subclasses to define the preferred representation. For the Cell base class the representation is merely returned unchanged.

class Cell_002(object):
    """Cell that converts input to 'actual' representation."""
    def __init__(self):
        self._value = None

    def Get(self):
        return self._value

    def Set(self, value):
        self._value = self.Convert(value)

    def Convert(self, value):
        return value

    value = property(fget=Get, fset=Set)

    # class Test000 (continued)
    def test004(self):
        """pre Cell: with generic Convert"""
        x = Cell_002()
        self.assertEqual(x.value, None)
        x.value = '123.45'
        self.assertEqual(x.value, '123.45')
        x.value = 123.45
        self.assertEqual(x.value, 123.45)

Then we can define subclasses that override the Convert method to change the input representation to a given datatype (such as string or floating point).

class Cell_003(Cell_002):
    """Pre StringCell: actual representation is string"""
    def Convert(self, value):
        return str(value)

class Cell_004(Cell_002):
    """Pre NumberCell: actual representation is float"""
    def Convert(self, value):
        return float(value)

    # class Test000 (continued)
    def test005(self):
        """pre Cell: with string Convert"""
        x = Cell_003()
        self.assertEqual(x.value, None)
        x.value = '123.45'
        self.assertEqual(x.value, '123.45')
        x.value = 123.45
        self.assertEqual(x.value, '123.45')

    def test006(self):
        """pre Cell: with number Convert"""
        x = Cell_004()
        self.assertEqual(x.value, None)
        x.value = '123.45'
        self.assertEqual(x.value, 123.45)
        x.value = 123.45
        self.assertEqual(x.value, 123.45)
        self.assertRaises(ValueError, x.Set, 'hello')

This is ok as long as the value can be converted to the desired representation, but what if it cannot? The Convert method(s) should be expanded to catch exceptions, returning None (or other fixed default value) if the conversion fails.

class Cell_005(Cell_002):
    """Pre NumberCell: on exception value converted to None"""
    def Convert(self, value):
        try:
            return float(value)
        except (ValueError, TypeError):
            return None

    # class Test000 (continued)
    def test007(self):
        """pre Cell: with safe number Convert"""
        x = Cell_005()
        self.assertEqual(x.value, None)
        x.value = '123.45'
        self.assertEqual(x.value, 123.45)
        x.value = 123.45
        self.assertEqual(x.value, 123.45)
        x.value = 'hello'
        self.assertEqual(x.value, None)

Two more comments are in order before going on. First, it should be remarked that returning None in the case of a failed type conversion does not necessarily commit us to three-value logic. Other options are discussed in the section on constraints. Second, there is no reason why the Convert method must be bound to an instance. It can just as easily be a static method, providing more flexibility in its use. (See Cell_035.)

3.2.2 A cell is observable

In event-driven applications, each cell must be observable (see Rationale). When we get around to implementing presenters, we will allow a view to modify the value of a cell-model directly, but the view must also respond to changes in the value made by other parts of the application. This is done through the observable-observer pattern. The observable-observer pattern will also be used in creating dependent variables.

There are a number of different ways we might implement this pattern, so the definitions here should be considered little more than pseudocode, possibly to be replaced later. The following code is about as simple as possible. It does not look for duplicate observers, for example, nor does it provide a way to remove observers. Nevertheless, it will suffice nicely for our thought experiment.

class Cell_010(Cell_002):
    """Cell that is an observable."""
    def __init__(self):
        self._observers = []
        self._value = None

    def AddObserver(self, observer):
        self._observers.append(observer)

    def NotifyObservers(self):
        for o in self._observers:
            o.Update(source=self)

Having defined a NotifyObservers method, we may now modify the Cell.Set method to notify its observers whenever a change in value occurs.

class Cell_012(Cell_010):

    def Get(self):
        return self._value

    def Set(self, value):
        value = self.Convert(value)
        if value != self._value:
            self._value = value
            self.NotifyObservers()

    value = property(fget=Get, fset=Set)

Observers are notified only when the value is changed, not (necessarily) each time the Set method is called.

Each observer must define a method Update(self, source), which updates the observer in response to being notified by the observable. Note that the observable pushes a reference to itself (as the source) to the observer, but nothing else. We could do less (push nothing at all) or do more (push the value as well), but for now we take the middle road with a push-pull notification mechanism (see Parameter passing).

class MyObserver:
    """Primitive observer for testing"""
    def Update(self, source):
        self._check = source.value

    def See(self):
        return self._check

class Test010(unittest.TestCase):
    """The *observable* interface for Cell class"""

    def test001(self):
        """pre Cell: AddObserver NotifyObservers"""
        x = Cell_012()
        y = MyObserver()
        x.AddObserver(y)
        x.value = 'change val'
        self.assertEqual(y.See(), 'change val')

One additional comment is in order here. The AddObserver method is part of the public API, at least at this point in the thought experiment, but our goal is to make the final syntax such that in practice this method gets called automatically. In other words its action(s) should be entirely transparent to the programmer. (See Automatic registration.)

3.2.3 A cell is a node

Cells are intended to be leaf nodes in a tree graph. A tree is not a tree, however, without branching nodes, so a full accounting of the design must await our investigation of branching nodes (composite models). Nevertheless, a few preliminary comments are in order at this point.

3.2.3.1 Parent

class Cell_020(Cell_012):
    """Cell that is a node."""
    def __init__(self, parent=None):
        self._parent = parent
        self._observers = []
        self._value = None

    def GetParent(self):
        return self._parent


class AParent: pass

The notation is somewhat reminiscent of the component binding syntax of the ui GUI toolkit, but there is one big difference. The uiGUI implementation (over which we hypothetically have no control) was designed for both a procedural style and an OO style of composition. Thus in ui the child takes responsibility for registering itself with the parent. By contrast, the abstraction-layer (over which we do have control) is designed for an OO style of composition exclusively, so each parent (each Structure) is responsible for adding its own children.

We might ask, appropriately enough, why the child needs to know its parent anyway. Might this not be an example of the YAGNI principle? That is a possibility, but two ideas come to mind. First, a child may want to mark itself and its ancestors as having changed value. Second, a child may want to notify its observers and those of its ancestors of a value-change event.

3.2.3.2 Name

Since each cell belongs to a labeled tree graph, it will necessarily have a name (label) within the parent node. There is no automatic technique of introspection to get this name from the child node itself, so the following code provides this capability.

class Cell_022(Cell_020):
    """Cell that is a node."""
    def __init__(self, parent=None, name=''):
        self._parent = parent
        self._name = name
        self._observers = []
        self._value = None

    def GetName(self):
        return self._name

class Test020(unittest.TestCase):
    """The *node* interface for Cell class"""

    def test001(self):
        """pre Cell: GetParent, GetName"""
        p = AParent()
        x = Cell_022(parent=p, name='somename')
        self.assertEqual(x.GetParent(), p)
        self.assertEqual(x.GetName(), 'somename')

3.2.3.3 No implementation assumptions

Note that the GetParent and GetName definitions make no assumptions about how the parent implements its internal naming and binding mechanism. Neither do these methods make any a priori assumptions about their own caching mechanisms. All that is required is that they return: (1) a reference to the parent object, or (2) the name string respectively.

3.2.4 A cell is constrained

3.2.4.1 Class attributes

Constraints are rules we write into the design of the abstraction layer (models) of our application that help enforce the integrity of the data. We begin by making the simplifying assumption that all constraints are class-based. This is not necessarily true, but by deferring our investigation of instance-based constraints, we may gain better traction toward our immediate goals. Moreover, a class attribute is the quickest way to begin working with class-based constraints, and since instance attributes override class attributes, instance-based constraints using attributes should be easy enough to add on later. The most direct way to do this is with a simple class attribute, something like this:

class Cell_030(Cell_022):
    """Cell that has a constraint (_default)."""
    _default = None

    def GetDefault(self):
        return self._default

    def Reset(self):
        self.Set(self.GetDefault())

class Test030(unittest.TestCase):
    """The *constraint* interface for Cell class"""

    def test001(self):
        """pre Cell: GetDefault, Reset"""
        x = Cell_030()
        self.assertEqual(x.GetDefault(), None)
        x.value = 'ujikol'
        self.assertEqual(x.value, 'ujikol')
        x.Reset()
        self.assertEqual(x.value, None)

Then each datatype can be given a general default value, if we wish, just by assigning that value to the class attribute. Here, for example, Cell_031 is a StringCell and Cell_032 is a NumberCell.

class Cell_031(Cell_030):
    _default = ''


class Cell_032(Cell_030):
    _default = 0.0

    # class Test030 (continued)
    def test002(self):
        """pre Cell: GetDefault, Reset"""
        x = Cell_031()
        self.assertEqual(x.GetDefault(), '')
        x.value = 'ujikol'
        self.assertEqual(x.value, 'ujikol')
        x.Reset()
        self.assertEqual(x.value, '')

    def test003(self):
        """pre Cell: GetDefault, Reset"""
        x = Cell_032()
        self.assertEqual(x.GetDefault(), 0.0)
        x.value = 'ujikol'
        self.assertEqual(x.value, 'ujikol')
        x.Reset()
        self.assertEqual(x.value, 0.0)

3.2.4.2 Credulous versus skeptical constraint notation

This way of defining a constraint is simple and effective, but it is also naive. It trusts the programmer-user to supply only valid default values, and in the proper representation for the given datatype. What if the programmer defines a default value this way:

class FluidRate(Cell_032):
    _default = 'Enter desired fluid rate'

... or even a valid value but wrong representation:

class FluidRate(Cell_032):
    _default = '100.0'

Under some circumstances these might be considered appropriate, but more often than not the default value will be used for more than just the initial text in a text field, for example. A default of None can be used for some version of three-value logic, and other values may be used for default logic. It would be possible to check the default value using Convert every time it is used, but this may or may not be adquate depending on how the constraint is used. Usually we want to assure that the stored default value is (1) a valid value, and (2) in the preferred (actual) representation, and this means being a bit more skeptical about the value given in the class definition.

The problem to be solved is demonstrated in the following unit test.

class Cell_033(Cell_030):
    """type is float, default thus invalid"""
    _default = 'yes'

    def Convert(value):
        return float(value)

    Convert = staticmethod(Convert)

    # class Test030 (continued)
    def test004(self):
        """pre Cell: invalid default provided"""
        x = Cell_033()
        # ValueError: invalid literal for float(): yes
        self.assertRaises(ValueError, x.Reset)

3.2.4.3 Class methods

The first step in creating a skeptical notation for constraints is to define a class method to assign the corresponding class attribute. For the default value, for example, we might define the following SetDefault class method:

class Cell_035(Cell_030):
    """Add a class method to SetDefault (check Convert first)"""
    def Convert(value):
        try:
            return float(value)
        except (ValueError, TypeError):
            return None

    def SetDefault(cls, value):
        safevalue = cls.Convert(value)
        cls._default = safevalue

    Convert = staticmethod(Convert)
    SetDefault = classmethod(SetDefault)

Then to establish a new "default value" constraint for new Cell lines, following notation is used.

class Cell_036(Cell_035): pass
Cell_036.SetDefault('100.0')

class Cell_037(Cell_035): pass
Cell_037.SetDefault('hello')

For Cell_036, the default supplied is valid and therefore all is well. For Cell_037, the given default is not valid, so the default is set to None as demonstrated in this unit test.

    # class Test030 (continued)
    def test005(self):
        """pre Cell: SetDefault class method"""
        x = Cell_036()
        x.Reset()
        self.assertEqual(x.value, 100.0)
        y = Cell_037()
        y.Reset()
        self.assertEqual(y.value, None)

At best, however, the above notation is ugly, really ugly, and at worst it is confusing. Much preferable would be a clean notation like the class attribute syntax ...

class FluidRate(Cell):
    default = 100.0

... but one that is nonetheless skeptical in its implementation. To make such a syntax work, it is necessary to refactor constraints, taking advantage of Python's metaclass feature.

3.3 Refactored constraints

3.3.1 The Constrained metaclass

Here again we follow the simplifying assumption that constraints apply to a class, as opposed to specific instances of a class. In other words (and for the time being), all instances of a given class are to be subject to the same constraints. What we are looking for is a way to use class methods, such as SetDefault demonstrated above, to test the constraint values provided for validity (or otherwise manipulate the value) while still using the class attribute notation. Although there may be better ways to accomplish this, one way is to change the metaclass for Cell so that it looks for a given class attribute in the class definition, and finding such an attribute, it applies a corresponding class method to test or otherwise compute the value provided.

First we define such a metaclass:

class Constrained_000(type):
    """Meta-class to provide 'specialization by constraint'."""
    def __init__(cls, name, bases, dict):
        """At initialization of class, check class attribute"""
        super(Constrained_000, cls).__init__(name, bases, dict)
        if 'default' in dict:
            cls.SetDefault(dict['default'])

Then we associate this metaclass with the class:

class Cell_040(Cell_035):
    """Using the Constrained metaclass"""
    __metaclass__ = Constrained_000
    default = None

    def Reset(self):
        self.Set(self.default)

    def SetDefault(cls, value):
        safevalue = cls.Convert(value)
        cls.default = safevalue

    SetDefault = classmethod(SetDefault)

Now we can define a subclass, NumberCell A.K.A. Cell_045 for example, overriding the Convert method to validate or change the representation of the value supplied to the default attribute:

class Cell_045(Cell_040):
    """Pre NumberCell (using Constrained metaclass)"""
    def Convert(value):
        try:
            return float(value)
        except (ValueError, TypeError):
            return None

    Convert = staticmethod(Convert)


class Cell_046(Cell_045):
    default = '100.0'

class Cell_047(Cell_045):
    default = 'hello'

class Test040(unittest.TestCase):
    """The Constrained metaclass solution"""

    def test001(self):
        """pre Cell: Constrained metaclass"""
        x = Cell_046()
        x.Reset()
        self.assertEqual(x.value, 100.0)
        y = Cell_047()
        y.Reset()
        self.assertEqual(y.value, None)

3.3.2 The Constrain method

The shortcoming of the metaclass defined above is that the Cell base class may not "know" ahead of time all the constraints (class attribute-method pairs) needed by every subclass. This means that every subclass (Cell type) will need its own metaclass. An alternative is to define a metaclass for the Cell base class that calls a generic class method (say Constrain) to be overridden in each subclass to take care of the constraints for that subclass.

class Constrained_001(type):
    """Meta-class to provide 'specialization by constraint'."""
    def __init__(cls, name, bases, dict):
        """At initialization of class, check class attribute"""
        super(Constrained_001, cls).__init__(name, bases, dict)
        cls.Constrain(**dict)


class Cell_050(Cell_030):
    """Using the modified Constrained metaclass"""
    __metaclass__ = Constrained_001
    default = None

    def Convert(value):
        return value

    Convert = staticmethod(Convert)

    def GetDefault(self):
        return self.default

    def SetDefault(cls, value):
        safevalue = cls.Convert(value)
        cls.default = safevalue

    def Constrain(cls, **kw):
        if 'default' in kw:
            cls.SetDefault(kw['default'])
        # other contraints to follow here

    SetDefault = classmethod(SetDefault)
    Constrain = classmethod(Constrain)


class Cell_055(Cell_050):
    """Pre NumberCell (using Constrained metaclass)"""
    def Convert(value):
        try:
            return float(value)
        except (ValueError, TypeError):
            return None

    Convert = staticmethod(Convert)


class Cell_056(Cell_055):
    default = '100.0'

class Cell_057(Cell_055):
    default = 'hello'

class Test050(unittest.TestCase):
    """The Constrained metaclass solution"""

    def test001(self):
        """pre Cell: modified Constrained metaclass"""
        x = Cell_056()
        x.Reset()
        self.assertEqual(x.value, 100.0)
        y = Cell_057()
        y.Reset()
        self.assertEqual(y.value, None)

3.4 Dependent cells

Database applications frequently require the presentation of data elements that have been computed from other data elements. These secondary data are so called dependent variables, variables the values of which are dependent upon the values of other variables, some of which must obviously be independent. In the Abstraction-Presentation framework, the definitions for such dependent variables are intended to be maintained as part of the overall data model, and to require a syntax similar to that used for defining independent variables. (Note: when we get around to the persistence mechanism, dependent variables will not be stored, and therefore not retrieved, but rather will be calculated on the fly from the values of independent variables that are stored and retrieved.)

3.4.1 External references

We already have in place a mechanism that could be employed to synchronize dependent variables with their independent referents, namely the observable interface. Dependent cells could be made observers of other cells, all of which are observable. Keep in mind that the primary function of the observable-observer pattern in the AP framework is to synchronize presenters with their models, but why not take advantage of the same mechanism for synchronizing models with other models?

For a cell to be an observer it must be an instance of a subclass of Cell in which an Update method has been defined. This method, called via NotifyObservers whenever the value of the observable is changed, polls the observable(s) for the new value(s) and changes its own value as a result. For this to work, the Update instance method needs to know how to find the observables specified for this particular instance of the observer subclass. These external references are supplied as positional arguments to the constructor following the parent and name arguments (*args) and stored in a list instance attribute, _refs.

class Cell_060(Cell_050):
    """Allow Cell to reference to other cells"""
    def __init__(self, parent=None, name='', *args):
        self._parent = parent
        self._name = name
        self._observers = []
        self._refs = args
        self._value = None


class Cell_061(Cell_060):
    """Make Cell into potential Observer: a doubler"""
    def Update(self, source):
        self.value = self._refs[0].value * 2

class Test060(unittest.TestCase):
    """Observer cell tests"""

    def test001(self):
        """pre Observer: doubles value of the observable"""
        x = Cell_061()
        y = Cell_061(None, '', x)
        x.AddObserver(y)
        x.value = 12.0
        self.assertEqual(y.value, 24.0)

This mechanism has at least two short-comings, however. First, as it stands every observer must be manually (procedurally) registered with its observables. This problem is addressed with automated (object-oriented) registration below. The second issue is that this syntax for defining the Update method of each dependent cell type is cumbersome, hard to read, harder to test, and difficult to maintain.

3.4.2 The Calculate method

Compare the following method definitions. The first two, both Update, are cumbersome because : (1) the subscripted references are not intuitive, and (2) the Set/Get methods and the value property approaches both seem verbose. The Update methods must be bound to an instance, as they require self for access to self._refs. Furthermore, the Update methods (here) are defined by the user, but they require access to the "private" attribute self._refs.

    def Update(self, source):
        self.Set(self._refs[0].Get() + self._refs[1].Get())

    def Update(self, source):
        self.value = self._refs[0].value + self._refs[1].value


    def Calculate(main, other):
        return main + other

    Calculate = staticmethod(Calculate)

By contrast, the Calculate method is clean and simple. It merely returns a result calculated from the value arguments provided, and the corresponding parameter names can be descriptive. Since this method requires no reference to self, it can be made static. To test the Calculate method, therefore, one simply calls the function with appropriate input values and check the returned value. Testing the Update methods, however, requires one to instantiate two independent cells and a dependent cell, set the values of the first two cells, and then check the value of the third. All of these pragmatic differences between Update and Calculate are compounded many times over if the code for calculating the dependent variable is complicated, as opposed to the simple addition above.

The problem with Calculate, of course, is that it must be given the proper values when it is called, and in any case an observer must have an Update method. That is not negotiable. The answer is to supply an Update method in the base class that calls Calculate. Then the subclassed observer only need to override the Calculate method.

    def Update(self, source):
        args = [i.Get() for i in self._refs]
        self.Set(self.Calculate(*args))

The Update method is now part of the fixed framework, and the Calculate method is part of the elaboration thereof. Putting these ideas all together, and adding code to catch possible exceptions, we can go back and define our "doubler" subclass like this:

class Cell_062(Cell_060):
    """Allow Observer to define static method, Calculate"""
    def Update(self, source):
        try:
            args = [i.Get() for i in self._refs]
            self.Set(self.Calculate(*args))
        except (TypeError, ValueError, ZeroDivisionError,
                OverflowError, AttributeError):
            self.Reset()

class Cell_063(Cell_062):
    """Define static Calculate to chain calculations"""
    def Calculate(arg1):
        return arg1 * 2

    Calculate = staticmethod(Calculate)

    # class Test060 (continued)
    def test002(self):
        """pre Observer: doubles value using Calculate method"""
        x = Cell_063()
        y = Cell_063(None, '', x)
        self.assertEqual(y.Calculate(511.3), 1022.6)
        x.AddObserver(y)  # needed here
        x.value = 12.0
        self.assertEqual(y.value, 24.0)

Note that this unit test tests the entire observable-observer mechanism. The calculation itself could be tested with a single statement:

        self.assertEqual(Cell_063.Calculate(12.0), 24.0)

Notice also that manual (procedural) registration of the observer with the observable is still required above.

3.4.3 Automatic registration

Since the observer instance knows its own observables at the time of instantiation, however, why not let the dependent cell take care of its own registration with the observable(s)?

class Cell_064(Cell_062):
    """Register self as observer of refs"""
    def __init__(self, parent=None, name='', *args):
        self._parent = parent
        self._name = name
        self._observers = []
        self._refs = args
        for i in self._refs:
            i.AddObserver(self)
        self._value = None

class Cell_065(Cell_064):
    """Define static Calculate to chain calculations"""
    def Calculate(arg1):
        return arg1 * 2

    Calculate = staticmethod(Calculate)

    # class Test060 (continued)
    def test003(self):
        """pre Observer: observer registers itself"""
        x = Cell_065()
        y = Cell_065(None, '', x)
        self.assertEqual(y.Calculate(511.3), 1022.6)
##        x.AddObserver(y)  # no longer needed
        x.value = 12.0
        self.assertEqual(y.value, 24.0)

This is the object-oriented approach, as opposed to the procedural approach in test002. For a few dependent variables, the difference is hardly noticable, but for dozens or hundreds of such observers, the benefit is substantial.

Although it would be no great burden to ask the user-developer to declare as a staticmethod each Calculate method defined (for each dependent cell type), we can take advantage of our constraint interface such that any time a Calculate method is defined, it is automatically made static.

class Cell_066(Cell_064):
    """Automatically make Calculate into static method"""
    def Constrain(cls, **kw):
        if 'default' in kw:
            cls.SetDefault(kw['default'])
        if 'Calculate' in kw:
            fnc = kw['Calculate']
            setattr(cls, 'Calculate', staticmethod(fnc))

    Constrain = classmethod(Constrain)

class Cell_067(Cell_066):
    """Define Calculate to double value"""
    def Calculate(arg1):
        return arg1 * 2

class Cell_069(Cell_066):
    """Define Calculate to add numbers"""
    def Calculate(*args):
        sum = 0
        for addend in args:
            sum += addend
        return sum

    # class Test060 (continued)
    def test004(self):
        """pre Observer: Calculate automatically static"""
        x = Cell_067()
        y = Cell_067(None, '', x)
        self.assertEqual(y.Calculate(511.3), 1022.6)
        x.value = 12.0
        self.assertEqual(y.value, 24.0)

    def test005(self):
        """pre Observer: multiple arguments for Calculate"""
        x = Cell_069()
        y = Cell_069()
        z = Cell_069(None, '', x, y)
        self.assertEqual(z.Calculate(2, 3, 4), 9)
        x.value = 12.0
        y.value = 13.0
        self.assertEqual(z.value, 25.0)

The only (slight) disadvantage is that the only visual cue that Calculate is a static method is the absence of self in the signature.

3.5 Notion of state

At this point the Cell class has evolved to a high degree of sophistication, perhaps too much sophistication, it may be objected. This objection has two aspects, philosophical concerns discussed below and technical concerns discussed here.

3.5.1 Extracting state values

Sooner or later, models (including cells) must be made persistent. The ultimate goal of the AP framework, after all, is to make it possible for the end user to interact with data elements that can be stored and retrieved. The technical problems associated with a highly developed Cell base class arise primarily in the context of persistence, and more especially serialization.

The pickle module implements the algorithm for serializing and de-serializing a Python object structure. In the protocol for pickling normal class instances, if there is no __getstate__ method, the instance's __dict__ is pickled. For cells, however, this is inappropriate. We don't need to store any of the design-time constraints, for example, and we certainly don't want to store the _observers and _refs lists, even if (or especially if) they are empty. In fact, the only attribute that should be pickled is the cell's _value, so we simply provide the following __getstate__ method:

class Cell_070(Cell_066):
    """Cell can extract value for Transaction"""
    def __getstate__(self):
        return {'_value': self._value}

Since copy.deepcopy uses __getstate__ too, two different unit tests can be used, one with deepcopy and one with pickle. In both cases, the _value attribute (and its value) is retained, while _observers, for example, is not.

class Test070(unittest.TestCase):
    """Get state"""

    def test001(self):
        """pre Cell: deepcopy extracts value alone"""
        x = Cell_070()
        x.value = 'myvalue'
        y = copy.deepcopy(x)
        self.assertEqual(y.value, 'myvalue')
        self.failUnless(hasattr(y, '_value'))  # still there
        self.failIf(hasattr(y, '_observers'))  # no longer there

    def test002(self):
        """pre Cell: pickle extracts value alone"""
        x = Cell_070()
        x.value = 'another value'
        p = pickle.dumps(x)
        y = pickle.loads(p)
        self.assertEqual(y.value, 'another value')
        self.failUnless(hasattr(y, '_value'))  # still there
        self.failIf(hasattr(y, '_observers'))  # no longer there

Dependent cells (with derived values) need not be pickled at all. When we get around to evolving the Structure class for holding cells, it will be necessary to determine which cells to leave out. Therefore the following IsDerived method is defined to return True if the value is derived and False if the cell is independent.

class Cell_071(Cell_070):
    """Create method to indicate if cell value is derived."""
    def IsDerived(self):
        """Return True if cell value is derived from refs."""
        return bool(len(self._refs))

    # class Test070 (continued)
    def test003(self):
        """pre Cell: IsDerived"""
        x = Cell_071()
        y = Cell_071()
        z = Cell_071(None, '', x, y)
        self.failIf(x.IsDerived())
        self.failIf(y.IsDerived())
        self.failUnless(z.IsDerived())

Note that this particular implementation uses the length of the _refs list to determine whether or not the cell is dependent. As long as the API is kept the same, we could implement the method in some other way, such as using a flag that is changed when the Calculate method is found and made static. Presence or absence of the Calculate method itself would not suffice, as the base Cell class may be given a noop Calculate method. Moreover, in some situations a potentially dependent cell might be used as an independent cell, so the implementation given here is probably the safest.

3.5.2 Transferring state values

The problem with extracting the essential state (_value) for pickling is that unpickling provides only that state without any of the helper attributes. All of the helper attributes are determined at design-time, however, so if an instance of any given Cell subclass is available, it should have all the proper helper attributes and be lacking only in the proper runtime value. For purposes of this discussion, call the unpickled instance the state and the design-time instance themodel. Depending on whether we prefer that the state pushes its value to the corresponding model, or that the model pulls its value from the appropriate state, one or the other of the following methods will do the job.

class Cell_080(Cell_071):
    """Two more methods for the *variable* interface"""

    def Send(self, destination):
        """Transfer content from self to destination."""
        destination.Set(self.Get())

    def Receive(self, source):
        """Transfer content from source to self."""
        self.Set(source.Get())

class Test080(unittest.TestCase):
    """Transferring values between cells."""

    def test001(self):
        """pre Cell: Send"""
        x = Cell_080()
        y = Cell_080()
        x.value = 'mjunhy'
        x.Send(y)
        self.assertEqual(x.value, y.value)

    def test002(self):
        """pre Cell: Receive"""
        x = Cell_080()
        y = Cell_080()
        x.value = 'mjunhy'
        y.Receive(x)
        self.assertEqual(x.value, y.value)

The key, of course, is making sure the states and models get matched up appropriately, but that issue belongs to the composite models (the Structure class).

3.6 Specialization of cells

The base Cell class is now complete (Cell_080). What remains is to demonstrate how this class can be specialized through two different kinds of mechanisms. Extension is the process by which the base Cell class (or one of its subclasses) is given additional functionality by adding possible constraints, for example, or by adding new methods. This is the kind of specialization that the framework-developer (I, for example) will provide for the user-developer. By contrast, elaboration is the process by which one of the Cell subclasses is specialized further by the user-developer to specify the constraints or to define a specific Calculate method. StringCell_001, for example, is an extension of Cell_080, while MyStringCell is an elaboration of StringCell_001

class StringCell_001(Cell_080):
    """Choices constraint for StringCell"""
    choices = []

    def SetChoices(cls, choices):
        cls.choices = choices

    def GetChoices(self):
        return self.choices

    def Constrain(cls, **kw):
        if 'default' in kw:
            cls.SetDefault(kw['default'])
        if 'Calculate' in kw:
            fnc = kw['Calculate']
            setattr(cls, 'Calculate', staticmethod(fnc))
        if 'choices' in kw:
            cls.SetChoices(kw['choices'])

    Constrain = classmethod(Constrain)
    SetChoices = classmethod(SetChoices)

class MyStringCell(StringCell_001):
    choices = ['papa bear', 'mama bear', 'baby bear']

class Test090(unittest.TestCase):
    """Further constraints"""

    def test001(self):
        """pre StringCell: choices constraint"""
        x = MyStringCell()
        expected = ['papa bear', 'mama bear', 'baby bear']
        self.assertEqual(x.GetChoices(), expected)


if __name__ == '__main__':
    unittest.main()

3.7 Philosophical concerns

What we have evolved here is a Cell class (scalar model) with four different interfaces, a clean, deliberate syntax, and many features essential to the Abstraction-Presentation framework. The principle objection, however, will be that the Cell class is possibly too sophisticated. What does this mean, "too sophisticated"? Well, among other things, it may be objected that the Cell class and its subclasses are too heavy, too rigid, or too tightly-coupled.

3.7.1 Too heavy?

The Cell class essentially defines a variable, that is to say, a wrapper around a value. The objection is that instead of being "thin", it is a "thick" wrapper with far too much overhead. Not every variable needs to be observable, for example, or constrained, and relatively few variables will be dependent. Why, then, should all variables be saddled with the overhead associated with these features?

One answer is that the overhead is actually not that great. The extra methods are defined only once (and exist at only one location in memory), so these clearly are not at issue. It is true that the added instance attributes do take up more memory, but two or three empty lists and two or three other simple attributes hardly seem excessive, even when the total number of cells is large.

It should also be pointed out that in the AP framework, most of the cells actually need to be observable. Furthermore, in some applications, half of the cells or more will be dependent, and the use of constraints is the rule rather than the exception. If these features are not provided in a unified Cell class, they will need to be provided in some other fashion, so you can "pay me now or pay me later." The Cell class does impose substantial overhead on top of a simple variable, it is true, but I contend that although the overhead could possibly be refactored elsewhere, it cannot be eliminated. Furthermore, I believe that by pulling these mechanisms together into one class, greater efficiency is achieved, not less.

A related objection is that in the context of data persistence, the added instance attributes must be stripped from the cell before serialization. This is a technical concern rather than one of principle, however, and one that I believe is adequately addressed above.

3.7.2 Too rigid?

Python is a powerful and flexible general programming language. The Cell class (among others to come) imposses restrictions that at first blush would seem to reduce this power and flexibility. Dynamic typing and dynamic binding, for example, are considered among the most desirable features of Python, whereas the Cell class establishes a static (design-time) data-typing mechanism and a static (design-time) name-binding mechanism. These are high-level mechanisms, however, and at the Python code level, nothing has changed. One is still free to utilize whatever representations of whatever values that one wishes. Moreover, the kinds of type checking allowed by the Cell class provide graceful ways of handling type mismatches.

It may also be objected that the syntax for instantiating cells as well as for defining subclasses of Cell is too restrictive. Defining a dependent cell, for example, requires overriding a specific method using a certain signature and a limited notation. Likewise, specifying constraints for a given subclass of cells requires following a well-defined syntax. But in the first place, such tradeoffs are ubiquitous in software engineering. What one gives up in flexibility through "pseudo-static" typing, for example, may be offset by safer type checking. And what one gives up by adopting a somewhat more limited syntax is offset by ease of coding and better maintainability.

Moreover, the syntax imposed by the Cell class is not gratuitously rigid or arbitrary. At every point, in fact, optimal flexibility has been the goal. Use of the Convert method for data typing, for example, allows more flexibility, not less. Making the Calculate method of dependent cells into a static method, for another, makes it easier to define and test than if it were to remain a bound (instance) method.

3.7.3 Too tightly coupled?

Perhaps the most serious objection to the sophisticated Cell class is that it promotes tight coupling, both internally and externally. Tight external coupling is best demonstrated by the Observable interface, where once an observer is registered with a given cell, the two are tightly coupled. It matters little that a mechanism could be defined to allow the observer to be removed from a given observable. The point is that while the observer is registered with an observable, the two objects are of necessity directly coupled. One way to reduce the degree of coupling (somewhat) would be to utilize a centralized signal dispatcher, but (1) even this would not eliminate coupling entirely, and (2) such a mechanism carries its own set of complications.

An example of internal coupling is the way in which the value property is tied to the observable interface through the Set method. Every time the value of a cell is changed, each observer is notified. This type of coupling could be reduced by removing NotifyObservers from the Set method, but then some other component would have to be responsible for notifying observers by calling this method. There are always trade-offs.

Tight versus loose coupling is an important issue in software engineering. Loose coupling is often touted as provided improved flexibility, reusability, and maintainability. I would argue, however, that GUI applications are by their very nature tightly coupled. The advantage of a GUI application over, say, a Web-based application is that mouse and keyboard events produce instantaneous feedback, potentially across the entire application without to-and-fro communication with a server. This kind of feedback is not always required, but when it is, nothing else suffices. The issue, then, becomes not how to avoid tight coupling, but how to manage coupling such that flexibility, reusability and maintainability are not sacrificed. That is the goal of the Abstraction-Presentation framework.

3.8 Discussion

Although Cell class is of no use without the rest of the Abstraction-Presentation framework, by the same token, the Cell class is the heart and soul of the AP framework. Cells are as essential to the AP framework as they are to a spreadsheet or to a living organism. The variable interface provides the core machinery for managing values. The observable interface allows the world to watch the values as they change. The node interface is the glue needed to hold the cells together properly. And the constraint interface provides a simple notation (syntax) for enforcing the integrity of the values.

Python has been called an agile programming language. At first blush it might seem that the AP framework diminishes this agility by insisting on thick wrappers around values, promoting tightly coupled components, and adopting a special syntax for defining the elements of an application. The answer to this objection is that all four of the Cell interfaces are essential. Nothing less would suffice. True, the necessary functionality could be relocated elsewhere, but it cannot be eliminated.


XHTML 1.0 © 2003 Donnal C. Walter,
This page updated 2003-12-09.
See About this document for information on suggesting changes.