Build your own data-verse

I am an abstraction person.

I find it very hard to move forward when solving a problem until I have a mental picture of all the moving parts and how they relate. This often leads to much scribbling and crossing out and drawing of boxes and arrows, and strange symbols and notations that make sense when I write them and are completely opaque the next day when I need them!

Working with Coda has been a fantastic mental trainer for me - you have to actually put stuff in tables and then use formulas to pull out the right bits in the right places in the right relations to the right other bits. For a highly abstract set of tools, it forces you to be very concrete, which provides some very important discipline to your thinking. You can fool yourself into believing you understand things when you write notes and sketch diagrams, but when the information you are working with comes “alive” and has to interact to produce the correct results, well... it better be organized, and organized right!

This tool is designed to help me with my problems (or with my OCD mind?) finding structure in data. It (hopefully) will help with finding the structure that exists in the world that you need to understand to organize and manipulate data to represent it. Or it may help you find the order you want to impose on the world in order to be more effective. Either way, I hope it is useful.

How it works:

You can use this doc to construct the rules of your own little universe - the place you plan to use Coda to observe or control. It is designed to help you brainstorm the conceptual structure of your domain and get down the information you need to decide how to set up your tables, look-ups and so on.

1: The concepts:

The design choice made for this doc is to represent everything with two tools:

Categories of objects (object types/classes... see below)

Relations between categories of objects

Categories of objects can be as abstract or as concrete as you need - they are just the things that matter in your use-case. They could be widgets for sale, people, meetings, events, tasks, times, problems, statuses, colors, moods, ideas... whatever. In order for this to work, you need to distinguish between the object types (think general or common nouns, if grammatical categories work for you) and the object tokens (think specific or proper nouns). So you might create a category of object engineer, but you wouldn’t create Jane the engineer (at this stage in the process).

This is, I think, a needed restriction. If you find that you are having trouble capturing all the distinctions you think matter, that is a sign that you need more categories or more relations to other categories to distinguish those differences. By keeping at the level of the general rather than the specific, you are both forcing yourself to figure out your data-verse, and also building for growth. If your categories are right, adding more individuals should be painless, and adding more structure should be too.

Relations between categories of objects are going to carry a lot of weight. (More weight than some of the more philosophically-minded may like - see note below!) We are using them here both to capture the properties of particular objects and also to capture things we might more normally think of as relations between things. See the example (even further) below!

Philosophical complaints...

If you have any training in logic or related areas, you might already be having problems with this setup. There does seem to be a difference (there IS a difference) between (1) relationships between distinct objects and (2) objects having properties. For the first, consider “Maria owns this car,” where there are two clearly distinct objects. For the second, consider “This car is red,” where red is a property of the car and not a distinct thing. These are, in important ways, not the same. But we are going to treat them the same!

Within this doc, we would set up the second as “Cars have colors.” Then a specific car being related to the specific color red would be “This car is red.” Whatever you think of this as philosophy, I think it works as Coda. If color is a significant property of cars that we need to track, then we are going to end up with something like a column on the car table with a heading “color” and row entries of specific colors. So Coda will need a thing (a column) to deal with color, and we can get away with treating having a color like having a car: as a relation to a thing.
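To make the two-tool idea concrete outside of Coda, here is a minimal Python sketch of a category table and a relation table. It is only an illustration under my own assumptions: the names Category, Relation and relations_from are hypothetical, not anything this doc or Coda defines. The point it tries to show is that "Person owns Car" and "Car has Color" end up stored in exactly the same shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """An object type, such as Car or Color (never one specific car)."""
    name: str


@dataclass(frozen=True)
class Relation:
    """A two-place link between categories, read as 'source verb target'."""
    source: Category
    verb: str
    target: Category


# Hypothetical categories taken from the examples in the text.
person, car, color = Category("Person"), Category("Car"), Category("Color")

# Both kinds of statement are stored the same way.
relations = [
    Relation(person, "owns", car),  # a relation between distinct objects
    Relation(car, "has", color),    # a property, treated as a relation
]


def relations_from(category, all_relations):
    """Return every relation whose source is the given category."""
    return [r for r in all_relations if r.source == category]


for r in relations_from(car, relations):
    print(f"{r.source.name} {r.verb} {r.target.name}")
```

Specific objects (this car, the color red) would come later as rows that point at these categories, which is roughly what a "color" column on a car table amounts to.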

A more substantial objection is that not all relations are between two things - consider “Raul gave the car to Abed.” I am just going to ignore this until 2.0, in the hopes that it is not functionally critical to anyone yet, and because I am not sure how to handle such relations in Coda anyway. Consider it an ongoing research project I am willing to revisit if needed...

2: An example:

I am a high school teacher, and one of my goals is to use Coda to improve how we operate. Schools are buried in structured data that is deeply interconnected, yet that data is frequently hard to access and near impossible to manipulate.

Once the data has all been entered in the tables you can generate reports like these:

(Interactive report controls: select a category to see its 1-deep tree and 2-deep tree.)

A @Teacher @belongs to a @Department or Departments
A @Teacher @is certified in an @Academic Field or Academic Fields
A @Teacher teaches a @Class or Classes

The above version of the report is designed to show that what we have here isn’t just a list of descriptive sentences, but is generated out of the category and relation tables according to the structure that has been input. It can also be displayed in an easier-to-read format:

⁠

A Teacher belongs to a Department or Departments
A Teacher is certified in an Academic Field or Academic Fields
A Teacher teaches a Class or Classes
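For readers who think better in code, here is a rough Python sketch of how sentences like these could be generated from a category table and a relation table. This is not how the doc's Coda formulas are actually written; the table layout, the stored plural forms and the function names article and report_for are all assumptions made for illustration.

```python
# Hypothetical category table: singular name -> plural name.
# Plurals are stored rather than guessed, since "Class" -> "Classes" is irregular.
categories = {
    "Teacher": "Teachers",
    "Department": "Departments",
    "Academic Field": "Academic Fields",
    "Class": "Classes",
}

# Hypothetical relation table: (source category, verb phrase, target category).
relations = [
    ("Teacher", "belongs to", "Department"),
    ("Teacher", "is certified in", "Academic Field"),
    ("Teacher", "teaches", "Class"),
]


def article(noun):
    """Pick 'a' or 'an' from the first letter (a simplification)."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def report_for(category):
    """Build one sentence per relation that starts at the given category."""
    lines = []
    for source, verb, target in relations:
        if source == category:
            lines.append(
                f"{article(source).capitalize()} {source} {verb} "
                f"{article(target)} {target} or {categories[target]}"
            )
    return lines


for line in report_for("Teacher"):
    print(line)
```

Because the sentences are built from the tables, adding a row to the relation table adds a line to the report with no other changes.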

⁠

3: Future Development:

For now, this doc is about getting the structure of your data down in a form that is persistent and easy to add to and manipulate. It is a 1.0 version. I would like it to do so much more.

It could clearly use better visualization tools - ways to take the network of categories and relations and show them more clearly. Interfacing with flow-chart tools, etc., would be really cool.

I would like to add sizes to the models - there are roughly this many of these, that many of those, etc. Currently there is some related functionality - you can determine whether the relation is one-to-one, one-to-many or many-to-many, and minimum and maximum values for the number of objects related. But there is no representation of roughly how many entities of each type there are likely to be, and this could affect design decisions.
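As a sketch of the existing functionality described above, the relation type can be derived from the minimum and maximum values on each side of a relation. The Python below is an illustration under assumptions, not the doc's actual formulas: the names Bounds and cardinality are hypothetical, None stands for "no upper limit", and the example numbers are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bounds:
    """Minimum and maximum number of related objects on one side."""
    minimum: int
    maximum: Optional[int]  # None means no upper limit (an assumption)

    def allows_many(self):
        return self.maximum is None or self.maximum > 1

    def accepts(self, count):
        """Check whether an actual count of related objects is within bounds."""
        if count < self.minimum:
            return False
        return self.maximum is None or count <= self.maximum


def cardinality(targets_per_source, sources_per_target):
    """Name the relation type, e.g. 'one-to-many', from the two bounds."""
    left = "many" if sources_per_target.allows_many() else "one"
    right = "many" if targets_per_source.allows_many() else "one"
    return f"{left}-to-{right}"


# Invented example: a Teacher teaches at least one Class,
# and each Class is taught by exactly one Teacher.
classes_per_teacher = Bounds(minimum=1, maximum=None)
teachers_per_class = Bounds(minimum=1, maximum=1)
print(cardinality(classes_per_teacher, teachers_per_class))
```

Note that this sketch can also report "many-to-one", which is simply a one-to-many relation read from the other side.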

I would love to be able to build it to give actual concrete advice on database structure - to have tools that take the relations and sizes and lead you to say “This should be my main table with this display column, and I am going to need these tables to support it, linked in these ways.”

I would love it even more if you could use a tool like this to discover the best data structure, then hit a “commit” button and have it generate the tables and look-ups... But that is beyond me right now - if it is not beyond you, please create the doc and share it!

I have not built an intelligent delete/reset mechanism. If you make errors or need to make changes, you will have to go to the data tables and remove rows. Sorry...

The biggest conceptual problems with this doc are, I think, the following:

It doesn’t do a good job of handling relations between objects in the same category - things like “X is a friend of Y.” Right now, the best way to handle that kind of relationship would be “X and Y are in Friend Group 1.” I don’t know enough about database design and I haven’t quite figured out how best to work this in. (Maybe I should use the “Build your own data-verse” doc to brainstorm some answers!)

It isn't (yet) equipped with the structures needed to handle loops in your relations. If A is related to B and B is related to C and C is related to A, the relationship chain would be infinite. This isn’t going to break anything right now, but it would be better if there was a mechanism that would prune the chains.
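One common way to prune such chains, sketched here in Python rather than Coda, is to stop extending a chain as soon as the next category is already on it. This is only a possible approach, not something the doc implements; the function name chains and the (source, target) pair layout are assumptions.

```python
def chains(start, relations):
    """List every relation chain from start, never revisiting a category
    that is already on the current chain."""
    # Build a lookup: category -> categories it is directly related to.
    graph = {}
    for source, target in relations:
        graph.setdefault(source, []).append(target)

    def walk(path):
        extended = False
        for nxt in graph.get(path[-1], []):
            if nxt in path:
                continue  # prune: following this link would loop
            extended = True
            yield from walk(path + [nxt])
        if not extended:
            yield path  # the chain cannot grow any further

    return list(walk([start]))


# The loop from the text: A -> B -> C -> A.
loop = [("A", "B"), ("B", "C"), ("C", "A")]
print(chains("A", loop))
```

With the loop above, the chain from A ends at C instead of going around forever.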

All of these are 2.0 concerns, and there are many people much smarter than me and much better at Coda who could solve them if they are inclined and think it worthwhile. If you do, please share!

⁠

If you just want to get started, go to:

⁠

Construction Zone - Start building your data-verse

⁠
