Correct term please
I need some help with the correct term for something. If you have an application collecting data from multiple sources, you want the application to manipulate the data such that it all uses the same terminology/codes/scale/units. You do this so you can use the data from different sources together and eventually do some data mining.

My first instinct is to call this "normalization". I think that's what a scientist would call it, but then "normalization" has a more specific meaning within databases. I am also thinking "standardization", but I'm not sure that captures the whole idea.
name withheld out of cowardice
I don't know if this is the correct term, but we call it data validation: collecting data from multiple sources and validating/fixing it so that it can be fed to the warehouse.
We called it data cleaning or data transformation.
The process you're describing has a well-known acronym, ETL: Extract, Transform, and Load. It describes the generic process of taking data from a variety of sources, transforming the data so that it all gels well together, and then loading it into a secondary data store.
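As a rough illustration (the sources, field names, and units here are all made up), the "T" in ETL is where the original poster's unit/code mismatches get resolved: each source's rows are mapped onto one shared scheme before loading:

```python
# Minimal ETL sketch: pull rows from two hypothetical sources that report
# temperature in different units and field names, transform them to one
# shared scheme, then "load" them into a list standing in for the warehouse.

def extract():
    # Two pretend sources with different units and field names.
    source_a = [{"city": "Oslo", "temp_f": 41.0}]
    source_b = [{"city": "Lille", "temp_c": 7.5}]
    return source_a, source_b

def transform(source_a, source_b):
    # Standardize everything to Celsius under one field name.
    rows = []
    for r in source_a:
        rows.append({"city": r["city"],
                     "temp_c": round((r["temp_f"] - 32) * 5 / 9, 1)})
    for r in source_b:
        rows.append({"city": r["city"], "temp_c": r["temp_c"]})
    return rows

def load(rows, warehouse):
    # Real ETL would write to a warehouse; a list suffices for the sketch.
    warehouse.extend(rows)

warehouse = []
load(transform(*extract()), warehouse)
print(warehouse)  # both rows now in Celsius under "temp_c"
```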
Brad Wilson (dotnetguy.techieswithcats.com)
Homogenize? Standardize? Regularize? Normalize? Marshal?
You'll probably find that what you are creating is a Data Warehouse to integrate and standardise your data. Then you need an application to access the DW.
This is orthogonal to the OP but it reminded me of it.
The OP's first instinct was on the right track: Enterprise Integration Patterns defines this type of process as a "Normalizer." Of course, that's completely different from the RDBMS notion of "normalization."
The people on my team sometimes refer to this process as "canonicalization".
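In practice (the codes below are invented for the example), canonicalization often boils down to a lookup table mapping each source's vocabulary onto one canonical value:

```python
# Canonicalization sketch: map each source's gender codes (hypothetical
# values) onto one canonical vocabulary before the data sets are merged.
CANONICAL = {
    "m": "M", "male": "M", "1": "M",
    "f": "F", "female": "F", "2": "F",
}

def canonicalize(value):
    # Normalize case/whitespace, then look up; "U" marks unknown codes
    # so they can be flagged for review instead of silently dropped.
    return CANONICAL.get(str(value).strip().lower(), "U")

print([canonicalize(v) for v in ["Male", "F", 1, "x"]])  # ['M', 'F', 'M', 'U']
```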
I'd call it "data fusion", although that term is more common in military applications.
Another vote for ETL. At least that is what it is called at our shop.