If you are a university using Salesforce (SFDC) alongside a data warehouse and a student information system (SIS) such as Banner, one of the challenges is settling on a deduplication strategy. Should you dedupe in SFDC, or in an intermediate data hub layer? And how do you build a scalable solution that dedupes both prospective and registered students while still leveraging the deduplication power of SFDC? This article walks through three options and their trade-offs to help you make that decision.
1. Dedupe everything in SFDC
One option is to do all deduplication in SFDC and make SFDC the system of engagement. Deduplication can be done with AppExchange tools such as RingLead and DemandTools, which come with sophisticated prebuilt algorithms for identifying duplicates.
Advantages:
1. Deduplication can be completed more quickly by using the AppExchange tools.
2. This works well for external file lists, such as SAT scores and prospective student lists, which can be imported easily into Salesforce from external systems. It ensures accurate tracking of marketing campaigns and prevents duplicate prospective student data.
3. This approach helps you track marketing activities aimed at existing registered students and makes it easier to identify cross-sell opportunities with them.
Considerations:
1. This approach is only effective if all the student data from your student information system is migrated to Salesforce. Only then can deduplication take place between prospective and registered students, so that no duplicate student data is maintained.
2. The first-time deduplication effort will take a lot of time.
3. Any prospective student entry from your portals, mobile apps, and events must be entered in SFDC first, before it goes to your SIS.
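To make the import scenario concrete, here is a minimal sketch of the kind of check that happens when an external prospect list is loaded against records already in the system. It is not how any AppExchange tool works internally; the `email` field, the sample records, and both function names are assumptions made up for illustration, and real tools use far richer matching than an exact email comparison.

```python
def normalize_email(email):
    """Lowercase and trim an email so trivial variations compare equal."""
    return email.strip().lower()


def split_new_and_duplicates(incoming, existing):
    """Separate incoming rows into new prospects and likely duplicates.

    Both arguments are lists of dicts with an "email" key (an assumed field name).
    """
    known = {normalize_email(r["email"]) for r in existing}
    new_rows, duplicates = [], []
    for row in incoming:
        key = normalize_email(row["email"])
        if key in known:
            duplicates.append(row)
        else:
            new_rows.append(row)
            known.add(key)  # also catches repeats within the same file
    return new_rows, duplicates


# Hypothetical sample data: one existing record and a three-row import file.
existing = [{"name": "Ana Diaz", "email": "ana.diaz@example.edu"}]
incoming = [
    {"name": "Ana Diaz", "email": " Ana.Diaz@Example.edu "},
    {"name": "Ben Lee", "email": "ben.lee@example.edu"},
    {"name": "Ben Lee", "email": "BEN.LEE@example.edu"},
]
new_rows, duplicates = split_new_and_duplicates(incoming, existing)
print(len(new_rows), "new,", len(duplicates), "flagged as duplicates")
```

The point of the sketch is the dependency noted above: the check can only catch a registered student if that student's record is already among the existing records, which is why this option requires migrating SIS data into SFDC.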
2. Dedupe in the HUB
If you use a data warehouse and data flows into it from your SIS, you can consider an intermediate database, called the HUB, that merges student data from SFDC, the SIS, and other systems. Deduplication across all of that student data can then be done in the hub.
Advantages:
1. If you use Informatica or another ETL tool, it can send data from the different systems to the HUB, where deduplication takes place. This is especially advantageous if you have already built deduplication modules for your SIS, because the hub can reuse them.
2. You do not need to migrate all of your student data to SFDC, which minimizes storage costs in SFDC.
Considerations:
1. Deduplication rules have to be custom built and updated regularly as new student lists and criteria are added.
2. Higher maintenance cost.
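As a rough illustration of what a custom-built rule in the hub might look like, the sketch below groups records from different source systems by a shared match key. The field names (`source`, `name`, `dob`), the choice of name plus date of birth as the key, and the sample records are all assumptions for this example; an actual hub would likely combine several rules and fuzzy matching.

```python
from collections import defaultdict


def match_key(record):
    """Build a comparison key from name and date of birth (assumed fields)."""
    # Lowercase the name and collapse repeated whitespace before comparing.
    name = " ".join(record["name"].lower().split())
    return (name, record["dob"])


def group_duplicates(records):
    """Group records sharing a match key; return only groups of two or more."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical hub contents loaded from SFDC and the SIS by an ETL job.
hub_records = [
    {"source": "SFDC", "name": "Maria  Lopez", "dob": "2004-05-17"},
    {"source": "SIS", "name": "maria lopez", "dob": "2004-05-17"},
    {"source": "SIS", "name": "John Park", "dob": "2003-11-02"},
]
duplicate_groups = group_duplicates(hub_records)
for group in duplicate_groups:
    print("possible duplicate across:", [r["source"] for r in group])
```

Every new student list or matching criterion means revisiting a function like `match_key`, which is where the maintenance cost of this option comes from.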
3. Hybrid approach: deduplication in SFDC and the HUB
In this approach, deduplication is done in both SFDC and the HUB, combining the strengths of the two approaches above. Any suspect or prospective student is uploaded to SFDC and deduplicated there using AppExchange tools. Existing registered students are deduplicated in the data hub using the existing deduplication modules.
Advantages:
1. Higher data quality, because deduplication is done in both SFDC and the HUB.
2. Faster to implement, because SFDC leverages AppExchange tools and the HUB leverages existing solutions built to handle duplicates.
Considerations:
1. Prospective students should be maintained in SFDC and pushed to the SIS only once they register.
2. There is a one-time data migration cost for moving prospective students to SFDC.
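The routing rule behind the hybrid approach can be sketched in a few lines. The `status` field and its values (`suspect`, `prospective`, `registered`) are assumptions chosen to mirror the wording above, not fields from any real SFDC or SIS schema.

```python
def dedupe_target(record):
    """Return the system that deduplicates this record under the hybrid approach."""
    status = record["status"]  # assumed field; values are illustrative
    if status in ("suspect", "prospective"):
        return "SFDC"  # deduplicated with AppExchange tools
    if status == "registered":
        return "HUB"  # deduplicated with the existing hub modules
    raise ValueError(f"unknown status: {status}")


def should_push_to_sis(record):
    """A record moves from SFDC to the SIS only once the student registers."""
    return record["status"] == "registered"


# Hypothetical records at different stages of the student lifecycle.
students = [
    {"name": "Ana Diaz", "status": "prospective"},
    {"name": "John Park", "status": "registered"},
]
for student in students:
    print(student["name"], dedupe_target(student), should_push_to_sis(student))
```

The sketch only works if every record carries an unambiguous status, which is the clear separation between prospective and registered students that this approach depends on.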
Based on these three approaches, universities can develop a data quality strategy that prevents duplicates in their student data:
1. If you use SFDC alone for deduplication, make sure you move all registered student data to SFDC to ensure data quality.
2. If you use the data hub for deduplication, ensure that prospective student data is loaded into the hub from every point of entry.
3. The hybrid solution works well if you have a clear separation between prospective students in SFDC and registered students in the HUB.
What approach does your university use for data quality in SFDC? Please feel free to post your comments or thoughts on this blog, and email me at firstname.lastname@example.org with any further questions.