Take a look at this set of addresses:
The time it would take you to parse out the addresses from a few thousand row could be better spent doing… well just about anything else (sleeping would be my choice, or maybe watching sports center). Using the Address Parse Transform will help free up that time. And here is how you configure it
Once you have connected your source data to the Task Factory Address Parse Transform you need to tell it what the address column or columns are. That’s right, it doesn’t need to be in one column it can be in multiple columns and it will still separate the address into the standard fields. This first example will use one column. Select the appropriate column from the dropdown.
Once that is done select the Address Quality Tab.
Once parsed the addresses are assigned a quality by the Address Parser Transform in two sections of the address.
· Address Line 1 and Address Line 2
· City, State and Zip
There are three levels of quality that can be assigned to the two sections of an address
· High
§ For address line 1 and address line 2, high quality indicates that the address is correct
§ For city, state and zip, high quality indicates that the city, state and zip correct
· Medium
§ For address line 1 and address line 2, medium quality means the address is complete but missing a part of the address like apartment number or street suffix
§ For city, state and zip, high quality means either the city, state or zip code is missing.
· Low
§ For address line 1 and address line 2, medium quality means the address is not verifiable.
§ For city, state and zip, high quality means more than one of the city, state or zip code is missing.
When the address parse runs it will parse out the data and assign the level of quality based on whether it can find a valid address and a valid city, state and zip according to the rules above. You can either include all of the address as part of the output from the transform or you can choose "Include only high quality parsed addresses." By selecting "Include only high quality parsed addresses" the output named "Parsed Address Output” will contain addresses that are considered high quality. You can then output the other rows to a separate table that will require more analysis, or simply disregard them all together.
You can also specify that some Null handling here as well.
Finally on the Parsed Output Tab you can specify how to handle the newly parsed data. You can:
§ Add New Column
§ Replace Column
You can also specify the data type and the length of each column as well as the code page. Finally you can also specify the casing of the data.
Now just hook it up to your destination and watch the Task Factory Address Parser Transform go to work.
If you get a chance take a look at Task Factory it has 26 transforms designed to assist your development. Good Luck to you all.
No comments:
Post a Comment