Data Integration – Data Lake – Business Intelligence

    Scoping Analysis Data Collection

    Scoping Analysis Overview

    This document is to be completed in preparation for the Data Integration/Data Lake/Business Intelligence discovery workshop and/or the overall solution proposal. It contains a list of questions designed to uncover the factors that affect the success of the planned project. The answers provided will serve as the basis for drafting the workshop agenda and/or the proposed solution. The questionnaire is intended to help us identify the organization's pain points and the specific measures of success that will be visible within the organization.

    People/Process Questions

    What defines success for the Data Integration/DW/BI/Data Lake (please check all that apply)?


    Who are the stakeholders?

    Specific function group/s:

    Which roles and how many users will there be, e.g. Data Scientists, Business Analysts, Report Analysts?

    Will you need support for the project after launch?

    Business Requirements

    Which business capabilities will the DW/Data Lake/BI enable, which problems or pain points will it help solve, and what business outcomes are expected?

    Do the business requirements include any real-time or near real-time prescriptive analytics?

    What are the typical questions users would like to answer?

    Data Preparation Requirements

    To help us understand the data preparation requirements for the project, kindly provide your comments/remarks in the table below:

    Capability | Comment/Remarks
    Will data cleansing (cleanup of data based on specified rules, e.g. de-duplication, establishing consistency, correction based on known data values) be required? |
    Where will the business rules/calculations be defined: in the data source or in the proposed solution? |
    Where will the aggregations be stored: in the data source or in the proposed solution? |
    Approximate number of measures or KPIs |
    Approximate number of dimensions |
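
    For respondents unsure what the data cleansing question covers, the three rule types it names can be illustrated with a minimal sketch. Everything in it is hypothetical and for illustration only: the field names (customer, city), the KNOWN_CITIES lookup and the sample records are assumptions, not part of any proposed solution.

```python
def cleanse(records, known_values):
    """Apply three illustrative cleansing rules to a list of dict records."""
    seen = set()
    cleaned = []
    for rec in records:
        # Establish consistency: trim whitespace and normalize letter case.
        name = rec["customer"].strip().title()
        city = rec["city"].strip().lower()
        # Correction based on known data values: map known variants to one
        # canonical spelling; unknown cities are only title-cased.
        city = known_values.get(city, city.title())
        # De-duplication: skip a record whose cleaned key was already seen.
        key = (name, city)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"customer": name, "city": city})
    return cleaned


# Hypothetical lookup of known variants (an assumption for this example).
KNOWN_CITIES = {"nyc": "New York", "new york city": "New York"}

sample = [
    {"customer": " alice smith ", "city": "NYC"},
    {"customer": "Alice Smith", "city": "new york city"},
    {"customer": "bob jones", "city": "boston"},
]

result = cleanse(sample, KNOWN_CITIES)
for row in result:
    print(row)
```

    In this sketch the first two sample records collapse into one after normalization, which is the kind of rule the comment column should describe if cleansing is required.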

    Data Ingestion Requirements

    To help us estimate the effort for the project, kindly provide details of the data sources that will be ingested into the DW/Data Lake/Data Repository in the table below.

    The first row is given as an example:

    Data Source/Application | Relational DB and Version | Estimated Number of Relational Tables | Estimated Data Size (Total) | Estimated Incremental Size (Daily/Monthly)
    POS System | MS SQL Server 2012 | 200 | 1 TB | 200 GB (monthly)
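
    The total and incremental sizes are what allow a rough projection of storage growth. A minimal sketch of that arithmetic, using the example row above, might look like the following. It assumes linear growth, no compression and no retention policy, and it takes 1 TB as 1000 GB; the function name is hypothetical.

```python
def projected_size_gb(initial_gb, monthly_increment_gb, months):
    """Project total data size assuming constant monthly growth.

    Assumption: growth is linear, with no compression, archiving or purging.
    """
    return initial_gb + monthly_increment_gb * months


# Example row: 1 TB total (taken as 1000 GB) growing by 200 GB per month.
size_after_year = projected_size_gb(1000, 200, 12)
print(f"Projected size after 12 months: {size_after_year} GB")
```

    A source that reports a daily increment would first need to be converted to a monthly figure before it can be compared with the example row.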

    Reporting and Dashboard Requirements

    Which applications are planned downstream of the DW/Data Lake (e.g. reporting applications, machine learning models, operational reporting systems, other data marts)?

    If there are pre-identified reports/graphs that the solution is expected to produce directly, kindly specify the details for each in the table below. The first row is given as an example.

    Report Name | Type of Report | Metrics/KPIs | Frequency of Refresh
    Daily Sales Report | Tabular | Sales per product; Total Sales |

    Users and Groups

    Please specify the total number of users of the system. Users with Edit/View permissions can create, view and share dashboards and reports with other BI users. Add rows as necessary. The first row is given as an example.

    Type of User | Tool | Permissions | Number of Users | Location (Head Office or Branch, can access anywhere) | Connection Speed
    Super User | MS Power BI | Admin | 2 | Can access anywhere | 50-100 Mbps

    Please specify whether the following features are required. Add further capabilities as necessary.

    Capability | Required/Not Required
    Manage access control and sharing through Active Directory groups |
    Control data access with row-level security for users and groups (report level) |
    Others (please specify) |

    Specify number of groups to be created (if required):

    Solution Deployment

    Capability | Comment
    What is the preferred delivery of the solution? |
    Hybrid (On-Premises + Cloud) |
    On-Premises Only through File Share |
    On-Premises Only through SharePoint |
    On-Premises Only through Intranet |
    Mobile Devices |
    Third-Party Integration |
    Cloud Service |