Comments on: Why Correct Data Entry Is Important
https://thomaslarock.com/2009/07/why-correct-data-entry-is-important/
Thomas LaRock is an author, speaker, data expert, and SQLRockstar. He helps people connect, learn, and share. Along the way he solves data problems, too.
Sun, 27 Nov 2011 05:25:09 +0000

By: Thomas LaRock https://thomaslarock.com/2009/07/why-correct-data-entry-is-important/#comment-782 Fri, 17 Jul 2009 11:47:16 +0000 http://thomaslarock.com/?p=2316#comment-782 In reply to Scott Holewinski.

They would have been better off hiring someone to design a better system.

By: Scott Holewinski https://thomaslarock.com/2009/07/why-correct-data-entry-is-important/#comment-781 Thu, 16 Jul 2009 18:35:47 +0000 http://thomaslarock.com/?p=2316#comment-781 My last company simply hired an IT guy to stay up late in case any bad data crashed the overnight processing.

By: Thomas LaRock https://thomaslarock.com/2009/07/why-correct-data-entry-is-important/#comment-780 Mon, 06 Jul 2009 18:21:12 +0000 http://thomaslarock.com/?p=2316#comment-780 In reply to Brett Flippin.

Thanks Brett, you raise some very good points to consider, including the experience level of the end user.

By: Brett Flippin https://thomaslarock.com/2009/07/why-correct-data-entry-is-important/#comment-779 Mon, 06 Jul 2009 14:08:04 +0000 http://thomaslarock.com/?p=2316#comment-779 Personally, I always try to cover this during my requirements-gathering follow-up meetings, after we’ve identified the data we’re going to be moving.

This should allow the business users to give you some kind of rules that you can implement in the ETL process to either bin or correct bad data. I’m not a big fan of correcting it, though: separating out bad data and notifying a business user that it needs to be corrected is a good way to get your users to take more responsibility for the data they input.
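As a rough illustration of the "bin rather than correct" idea, here is a minimal Python sketch. The rule functions, the field names (`quantity`, `order_date`), and `split_rows` are all hypothetical, made up for this example; real rules would come from the business users, and a real ETL tool would have its own way of routing rejected rows.

```python
from datetime import datetime

# Hypothetical business rules: each returns an error message, or None if the row passes.
def check_quantity(row):
    qty = row.get("quantity")
    if not isinstance(qty, int) or qty <= 0:
        return "quantity must be a positive integer"
    return None

def check_order_date(row):
    try:
        datetime.strptime(row.get("order_date", ""), "%Y-%m-%d")
    except (TypeError, ValueError):
        return "order_date must be YYYY-MM-DD"
    return None

RULES = [check_quantity, check_order_date]

def split_rows(rows, rules=RULES):
    """Return (clean, binned). Binned rows keep the reasons they failed,
    so they can be sent back to a business user instead of silently fixed."""
    clean, binned = [], []
    for row in rows:
        reasons = [msg for rule in rules if (msg := rule(row)) is not None]
        if reasons:
            binned.append({"row": row, "reasons": reasons})
        else:
            clean.append(row)
    return clean, binned
```

Only the `clean` list would continue through the load; the `binned` list, with its reasons attached, is what you would hand to whoever owns the source data.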

Another rather novel approach, if your users are less than knowledgeable about what good and bad data look like, is to create a clustering mining model from the record set you’ll be moving in the ETL process. This could, in theory, identify data that “doesn’t fit” in your record set based on the data you’ve already collected.
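To make the clustering idea a little more concrete, here is a small Python sketch. It is not a mining model from any particular product; it swaps in a plain k-means over numeric columns, fits it on records already collected, and flags incoming records that sit far from every cluster. The function names, the choice of `k`, and the `factor` threshold are assumptions for illustration only.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means with a deterministic start (the first k points)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(col) / len(group) for col in zip(*group))
    return centroids

def distance_to_nearest(point, centroids):
    return min(math.dist(point, c) for c in centroids)

def flag_misfits(history, incoming, k=2, factor=3.0):
    """Flag incoming records that lie more than `factor` times farther from
    any centroid than the worst-fitting historical record does."""
    centroids = kmeans(history, k)
    typical = max(distance_to_nearest(p, centroids) for p in history)
    return [p for p in incoming
            if distance_to_nearest(p, centroids) > factor * typical]

# Made-up numeric records: two tight groups in the history, one odd incoming row.
history = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
incoming = [(1.5, 1.5), (50, -3), (10.5, 10.5)]
misfits = flag_misfits(history, incoming)
```

Anything in `misfits` is only a candidate for review, not proof of bad data, which fits the "in theory" caveat above.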

Of course, the best thing to do is design the data collection front end with good enough data validation that none of these measures would be necessary, but that is unfortunately not an option most of the time, especially when dealing with vendor software.
