Wednesday, July 11, 2012

A litter is a bunch of cat(egorie)s...

For some reason, I started thinking about organizing a bunch of 'anything'.  That probably grew out of my previous post about what programming is - but I can't be sure.

There are thousands of apps that slice into that pie of organizing something (your CD library, sock drawer, shopping list, etc).  But, as far as I know, there isn't one that will organize anything.

And, that led me to consider what it would take to make such a thing.  I did not go the next step to wonder how many people would want an app that would be generic enough to allow them to catalog their collection of stamps in the same place as they cataloged their collection of CDs, shirts, socks, etc.  Neither will I start trying to figure that out now - instead, I will go through the exercise of building it.

If you are interested in watching the app evolve, stay tuned.

Phase one: Figure out what your data is...

As I go through this, I will be designing the data structures to hold the information.  Since I am most familiar with SQL, I will be using tables (in my head, they are mySQL tables - but that is just in my head).  Also in my head, every table should have an id column that has no intrinsic meaning.  If your table has a name or description (or any other meaningful piece of information) as its primary key, things will get messy if you ever want to change that column.

Since this is intended to be a program to allow you to catalog anything - there is no way for me to determine exactly what the information being stored will be.  So, instead, I need to figure out a way to describe an 'anything' and group together all of the attributes that are necessary to describe one.

Aha!  An 'anything' has attributes that describe it!  I'll need an 'attribute' table.  There will not need to be much to it though.  Just the name of the attribute and the type of information that it will hold.

Crap - I need an attribute value type table too.

attributeValueType (
id bigint auto_increment primary key,
valueTypeDescription varchar(255),
enterValue boolean default true,
multiValue boolean default false
)

This table has two boolean (true or false) columns.  This took a little bit of thinking ahead.  And the reason for them is this.  An attribute will either allow the user to 'select a value from a list' or 'enter a value'.  Which of these two applies will be indicated by the value of the enterValue column.  If the value is 'true' then the user will be able to enter any value that they want.  If it is false, they they can only select from the list of options that the attribute permits.  The second boolean indicates whether or not it is possible to have more than one value for the attribute.  An example of an attribute that allows only one value might be the 'color' of a crayon.  But, a sweater might have several values for the 'color'.

You might be looking at this table and thinking to yourself that even though we have this nifty table, there are only four possible combinations of enterValue and multiValue.  And, you would be right.  Since you would be right, it would be an option to simply hard code these four values and forget about having the table altogether.  I hate hard coding if I can avoid it though.  If you decide to implement this system - feel free to get rid of this table.

But, there is a method to my madness... Keeping this in a table will allow us to extend the system later when we realize that there are more kinds of information we might want to keep track of (like dates or currency amounts for example).  For now, I am keeping things simple.

Here are generic versions of the four records that would go into the attributeValueType table:
  • ('Enter Single Value', true, false) - gets an id of 1
  • ('Enter Multiple Values', true, true) - gets an id of 2
  • ('Select Single Value', false, false) - gets an id of 3
  • ('Select Multiple Values', false, true) - gets an id of 4
Now, we can define our attribute table.

attribute (
id bigint auto_increment primary key,
attributeDescription varchar(255),
attributeValueTypeID bigint
)

Now we are starting to get somewhere interesting.  We can begin to think about one particular thing and how to describe it.  i.e. What are its attributes?  To help think about this, we should probably pick something to be our 'anything'.  As long as we keep the goal of a system that can keep track of 'anything' it won't hurt to narrow things down while we are designing/building it.

I will pick crayons - because there are not very many attributes to keep track of.  Actually, there are lots of attributes of crayons.  Most are things that would usually be ignored (diameter, wax type, melting point, flash point, weight, wrapper material, etc) with only a few that would 'usually' be kept track of (color, length).

So, lets make entries for the attributes to describe a crayon:
  • ('Color', 3) - gets an id of 1
  • ('Length', 1) - gets an id of 2
Here is the rational behind these choices for the attributeValueType.  The color will be a single value (most crayons have only one color) that will be picked from a list.  The length will simply be entered in.  It would be possible to change the length to a 'select from list', but in this case - it would probably be silly to try to enumerate all of the possible lengths that a crayon could have (3", 2.99", 2.98", endless silliness..., 0").

Wait a second.  We don't have a place to hold those color choices.  Time for another table.  Now, these values will need two things.  They will need to know what attribute they belong to (attributeID) and they will need the actual value that they are (valueData).

attributeValues (
id bigint auto_increment primary key,
attributeID bigint,
valueData varchar(255)
)

Since we are here, we will go ahead and enter the color choices that we will need for our crayons (I'll pretend there are only the seven colors of the rainbow for our crayons):

  • (1, 'Red') - gets an id of 1
  • (1, 'Orange') - gets an id of 2
  • (1, 'Yellow') - gets an id of 3
  • (1, 'Green') - gets an id of 4
  • (1, 'Blue') - gets an id of 5
  • (1, 'Indigo') - gets an id of 6
  • (1, 'Violet') - gets an id of 7

Each of those got an attributeID of 1 because they all belong to the attribute 'Color' that we created earlier.

Ack.  That is an awful lot of work considering that we didn't even get to making a 'thing' yet.

We'll take a break for now.  But come back!

Next time, we will finally make a cat(egory).

Monday, July 2, 2012

What is programming...

The non-programmers that I know usually do not have a particularly good idea of what programming is.

I know that none of the people that I have in my circle of people really know what I do for a living.  Most seem to think that I fix hardware.

To be clear, I am mostly going to talk about programming from a utility or business standpoint.  But even games follow the same principles (although it might not be as obvious).

Programming is a natural thing that people do all of the time - but usually, it is 'throw-away' coding.  They figure out how do perform a task, perform the task, and forget about the whole thing until (unless) they need to do it again.

Computer programming requires something more regimented.  We (you) are figuring out how to explain to a computer exactly how to perform a task.  Because computers are very good at doing exactly the same thing over and over again (but not good at filling in the blanks if you skip a step) it is important to include every necessary step.

Good computer programming would require making those steps simple.  Computers can follow complicated steps, but when you are trying to read your work in the future (or someone else is trying to read it) simple will be easier for you.  And a good practice would be to make sure that not only are your steps as simple as possible, but that you make them self describing.  Name variables so that the tell you what they hold.  Put in comments.  Include logging that makes your program tell you what it is doing as it is doing.

Great programming adds in speed and elegance.

There is no one way to add speed and elegance.  I think that mostly, they will come from practice at trying to be a good programmer.

I think that every program is made out of a small number of 'major' parts.

  • Data
  • Presentation/Input
  • Processing
The data is the information that your program has to work with.  Whether it is temperature and wind measurements for a weather modelling program - or the amount of money in your bank account along with your deposits and expenditures...data is the whole reason for writing a program.  It is what you want to keep track of and the building blocks that you need to solve a problem.

Presentation is how you show your data (and results) to your user.  Input is how you get the source data from them.  Ugly screens or complicated interfaces will make it more difficult for someone to use your program.  It may be the best at what it does, but if it is painful to use...no one will ever know that.

And processing is all of the work mixing the data together and applying various rules against it.  If you have the best data and the most pleasant, understandable presentation...It will not matter if you program does not do something useful (and correct) with it.

From the order that I listed them, you would probably infer that I think the data is the most important part of the puzzle.  And, you would be right.  If your data is wrong or you are keeping track of the wrong information - your system is never going to be useful.  So, the first step in any programming exercise should be to figure out what your data is.

But, whatever you come up with will probably be wrong.  With practice, you will get closer to being initially right (so keep practicing) but you will need to keep an open mind as to what needs to be kept track of.  And also, what your user will need back from you.  As you move through the process both sides of that equation will (hopefully) become clearer and clearer.  As changes make themselves obvious - adjust.

The next part of programming is figuring out how you will get that starting information from the user, how you will show them what they told you, and how you will give the results back to them after they have been processed.  You will also need to figure out how the processing actions are going to be started.  Will the user press a button on screen, set a timer, etc...?  A well designed user interface is one of the things that can make or break a program.

And finally, what processing needs to be done?  There needs to be a payoff for your user otherwise, they will not go through the bother of starting your program (never mind taking the time to enter all of their data into it!).  This is where the meat of programming is.  Someone has a problem that they want solved.  It is a problem that requires keeping track of a lot of information or performing difficult manipulation of information (or both).  Sometimes your user would be capable of doing that work themselves and sometimes they will not be.  You will need to make it 'cheaper' (in time, stress, accuracy) for them to use your program than to do the work themselves.

This is a little more than strictly 'programming'.  Working through all of these would encompass system analysis, system design, interface design, data processing, as well as the programming part.  If you have a team to work with and each person takes a piece - that might make it easier for each to become expert at the part they do.  But I would say that being able to see the big picture and all of the parts is what truly makes creating programs fulfilling.

Thoughts?