Data Integration – Data Lake – Business Intelligence

    Scoping Analysis Data Collection
    Scoping Analysis Overview

    This document is intended for use in preparing the Data Integration/Data Lake/Business Intelligence discovery workshop and/or the overall solution proposal. It contains a list of questions designed to uncover factors that affect the success of the planned project. The answers provided will serve as the basis for drafting the workshop agenda and/or the proposed solution. The questionnaire is intended to help us identify the organization’s pain points and the specific measures of success that will be visible within the organization.

    People/Process Questions

    What defines success for the Data Integration/DW/BI/Data Lake project (please check all that apply)?


    Who are the stakeholders?

    Specific functional group(s):

    Which roles and how many users will there be, e.g. Data Scientists, Business Analysts, Report Analysts?

    Will you need support for the project after launch?

    Business Requirements

    Which business capabilities will the DW/Data Lake/BI solution support, which problems and pain points should it help solve, and what business outcomes are expected?

    Do the business requirements include any real-time or near real-time prescriptive analytics?

    What are the typical questions users would like to answer?

    Data Preparation Requirements

    To help us understand the data preparation requirements for the project, please provide your comments/remarks in the table below:



    Will data cleansing (cleanup of data based on specified rules, e.g. de-duplication, establishing consistency, correction based on known data values) be required?

    Where will the business rules/calculations be defined: in the data source or in the proposed solution?

    Where will the aggregations be stored: in the data source or in the proposed solution?

    Please specify the approximate number of measures or KPIs.

    Please specify the approximate number of dimensions.

    Data Ingestion Requirements

    To help us estimate the effort for the project, please provide details of the data sources that will be ingested into the DW/Data Lake/data repository in the table below. The first row is given as an example:

    | Data Source/Application | Relational DB and version | Estimated number of relational tables | Estimated Data Size (Total) | Estimated Incremental Size (daily/monthly) |
    | POS System | MS SQL Server 2012 | | | 200 GB - Monthly |

    Reporting and Dashboard Requirements

    Which applications are planned downstream of the DW/Data Lake (e.g. reporting applications, machine learning models, operational reporting systems, other data marts)?

    If there are pre-identified reports/graphs that the solution is expected to produce directly, please specify the details for each in the table below. The first row is given as an example.

    Report Name

    Type of Report


    Frequency of refresh

    Daily Sales Report


    Sales per product

    Total Sales

    Users and Groups

    Please specify the total number of users of the system. Users with Edit/View permissions can create, view, and share dashboards and reports with other BI users. Add rows as necessary. The first row is given as an example.

    | Type of User | | Number of Users | Location (Head Office or Branch, can access anywhere) | Connection Speed |
    | Super User | MS Power BI | | Can access anywhere | 50-100 Mbps |

    Please specify whether the following features are required. Add additional capabilities as necessary.


    Required/Not Required

    Manage access control and sharing through Active Directory Groups

    Control data access with row-level security for users and groups (report level)

    Others (please specify)

    Specify number of groups to be created (if required):

    Solution Deployment



    What is the preferred delivery of the solution?

    Hybrid (On-Premises + Cloud)

    On-Premises Only through File Share

    On-Premises Only through SharePoint

    On-Premises Only through Intranet

    Mobile Devices


    Third Party Integration

    Cloud Service