Data Handler 2: Common Problems

By SEAN MACKENZIE
Published: June 6, 2008

Welcome to the second article in the Data Handler series. This week we'll be covering common problems that users experience when assembling their data. These problems usually arise in an office setting, where users are working on projects that involve many different people. There are almost always people who enter data into computers, people who drop off data (like couriers, or people handing in a time sheet), people who review data (supervisors, managers, and analysts), and many other people in various roles. When many people are involved, data collection issues often come up.

Common Problems

This Data is Old! One of the most frustrating things is spending hours working from a spreadsheet, only to find out that the data is a week old. Say you were a part inspector for a factory, and you inspected a whole bin full of parts, only to find out that the spreadsheet you worked from was an old one! Your coworker already did that bin yesterday!

Who Has the Right Copy? Another common problem is when people start working on their data entry in parallel, adding and updating information, only to find out they both had the wrong copy. This can create hours of extra work because then the changes either have to be redone on the right sheet, or, even worse, merged onto the right sheet manually.

Corrupted! The probability of office-related files getting corrupted goes up as users leave them open all day long, make thousands of changes, and invite others to use them at the same time. The file handling applications weren't designed for heavy usage like this! I remember seeing an important department spreadsheet that had some rows in the middle of it that had been corrupted because some people were always changing the same few rows. You could scroll down the spreadsheet until the corrupted "garbage" rows, or you could start at the end of the spreadsheet and scroll up to the garbage rows. If you scrolled onto the garbage rows, your system would lock up!!

That's Not What I Put in There!! Have you ever imported a bunch of data from somewhere else and then later discovered that your spreadsheet program changed the data? For example, I once saw a woman import ten thousand rows into her sheet, only to find later that the program had treated a column of Product ID numbers as ordinary numbers and stripped the zeroes they started with. So, Product 0012345 became Product 12345. When she attempted to match the data up with other sheets, nothing lined up. She had to change the cell properties to text, then update all the rows to have the correct values.
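The leading-zero problem can be sketched in a few lines of Python. This is only an illustration, not what any particular spreadsheet program does internally: the sample data, the column names, and the seven-character ID width are all assumptions made for the example. It shows that the zeroes survive as long as the IDs are kept as text, are lost the moment they are converted to numbers, and can only be restored afterwards if you happen to know the original width.

```python
import csv
import io

# Hypothetical export containing Product IDs that start with zeroes.
raw = "product_id,description\n0012345,Widget\n0000789,Gasket\n"

rows = list(csv.DictReader(io.StringIO(raw)))

# The csv module returns every field as text, so the zeroes survive.
ids_as_text = [row["product_id"] for row in rows]

# Converting the IDs to numbers is the step that silently drops the zeroes.
ids_as_numbers = [int(row["product_id"]) for row in rows]


def restore_leading_zeros(value, width=7):
    """Pad a numeric ID back to a fixed width (the width is an assumption)."""
    return str(value).zfill(width)


repaired = [restore_leading_zeros(number) for number in ids_as_numbers]

print(ids_as_text)
print(ids_as_numbers)
print(repaired)
```

The safer habit is the first one: treat identifiers as text from the start, so no repair step is needed.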

What You Put in Isn't What I Put In! Almost as frustrating as the above is the case where everyone is updating a sheet but they all spell things in different ways. In time, when someone needs to make a report, they discover that a particular name was spelled four different ways, the street addresses had spelling mistakes on some lines, and other kinds of errors. Take the name Sean for example; it can be spelled Sean, Shawn, Shaun, Shon, Sian, and other interesting ways. If nobody enforces one spelling of a particular name, reports will be incorrect in the end.
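One way to enforce a single spelling is to check every entry against an agreed list before it is stored. The sketch below is a minimal illustration under stated assumptions: the mapping table and the function name are hypothetical, and it assumes the variants really do refer to the same person. Anything not on the list is passed through unchanged and flagged for a human to review, rather than guessed at.

```python
# Hypothetical mapping from known variants to the one agreed spelling.
# It assumes these variants all refer to the same person.
CANONICAL_NAMES = {
    "sean": "Sean",
    "shawn": "Sean",
    "shaun": "Sean",
    "shon": "Sean",
}


def normalize_name(raw_name):
    """Return (name, recognized) for one typed-in value."""
    trimmed = raw_name.strip()
    key = trimmed.lower()
    if key in CANONICAL_NAMES:
        return CANONICAL_NAMES[key], True
    # Unknown spelling: keep what was typed and flag it for review.
    return trimmed, False


entries = ["Shawn", " sean ", "SHAUN", "Sian"]
cleaned = [normalize_name(entry) for entry in entries]

for name, recognized in cleaned:
    print(name, "" if recognized else "(needs review)")
```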

This is Huge! Office programs are certainly designed to hold a lot of data. However, after a certain point they really become hard to use - and hard to read! One person I worked with managed to get 30,000 rows into a word processor document but then needed to find data based on several columns. It turned out to be extremely tough to find anything, and working on the document caused his computer to become unresponsive for long periods. I also saw a spreadsheet with more than 50 columns and close to 40,000 rows. It was a customer orders application which also had product information in it. Since some products didn't apply to certain customers, the sheet had huge areas with no data. Several people were trying to use it, and it was a mess.

Why Can't We All Use It? Usually, any data that has any worth in an office also has several people (or many people) who want to use it from time to time. They all expect that the data will be up-to-date and ready to use at any given time. However, since office software isn't designed to handle this situation, it is usually the case that people get locked out, so they make additional copies which they update and need to coordinate with the "master" later on. This causes confusion for everyone involved. People are constantly waiting for others to "get out" of the spreadsheet so they can go in and do what they need.

I've Heard of a Database, but...

These are just a few examples of the trouble that people get into while collecting data in their word processors and spreadsheets. In fact, initially collecting data in these programs isn't a bad thing, if you think ahead. If you think about things like where the data will be in six months, or how many people will use it, you may realize that keeping the data in this format won't work for long. If the data is particularly important to your organization, then this preparation could save you and the people around you incredible amounts of time in the future. A software database can:
  • allow data collection and display in real-time
  • allow many users to use data at one time
  • collect huge amounts of data with no problem
  • formalize data collection and organize it efficiently
  • eliminate multiple copies of data in different places
  • avoid corruption problems

So, as you can see, common problems could be avoided by using a database for your data collection. This leads us to our first rule of thumb...

  • If you use a Word Processor or a Spreadsheet for data collection, plan ahead so your data can be used in a database.
  • Do not use a Word Processor or Spreadsheet for serious, long term data collection.

... and provides a nice lead in to next week's topic, How to Know When You Need a Database (and How to Get One).
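
As a rough illustration of what "plan ahead so your data can be used in a database" might look like, the sketch below loads a spreadsheet export into SQLite using Python's built-in sqlite3 module. The table name, column names, and sample rows are assumptions made for the example. It uses an in-memory database so it can be run anywhere; a real office setup would use a shared database instead. Product IDs are stored as text so leading zeroes are kept, and a uniqueness rule stops the same row from being entered twice.

```python
import csv
import io
import sqlite3

# Hypothetical CSV exported from a spreadsheet; note the repeated last row.
exported = (
    "product_id,customer\n"
    "0012345,Acme\n"
    "0000789,Globex\n"
    "0012345,Acme\n"
)

conn = sqlite3.connect(":memory:")  # in-memory database, just for the sketch

# TEXT keeps leading zeroes; UNIQUE rejects duplicate entries.
conn.execute(
    "CREATE TABLE orders ("
    " product_id TEXT NOT NULL,"
    " customer TEXT NOT NULL,"
    " UNIQUE (product_id, customer))"
)

rows = csv.DictReader(io.StringIO(exported))
with conn:  # commits the inserts together, or rolls back on error
    conn.executemany(
        "INSERT OR IGNORE INTO orders (product_id, customer)"
        " VALUES (:product_id, :customer)",
        rows,
    )

stored = conn.execute(
    "SELECT product_id, customer FROM orders ORDER BY product_id"
).fetchall()
print(stored)
```

The spreadsheet only needs one header row and one record per line for an import like this to work, which is most of what planning ahead amounts to.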
