Showing posts with label Spreadsheet Management Software. Show all posts
Showing posts with label Spreadsheet Management Software. Show all posts

Tuesday, 23 August 2011

Controlled Use of Excel for BI

1. Introduction

In most Financial Institutions the use of Excel is so embedded that any project to curb this is almost doomed to failure from the start.

This blog is an attempt to elicit discussions and to explore the appetite to address some of the inherent risks with Excel spreadsheets and to see how one can deliver supportable MI using Excel. It should be stressed that this proposal should not be seen as a replacement for strategic tools or processes.

  
2. Benefits and Risks of Using Excel

The use of Excel comes with a multitude of pros and cons. In days past the downsides were considered to be worth the trade-off against the benefits but his is starting to change with the introduction of more stringent governance via legislation such as Solvency II, Basel II and SOX.
Below is a list of benefits and risks of using Excel in your core data streams.


2.1. Benefits
• The skills base for Excel is large and practically every person within a company will be familiar with it.


• Excel is relatively cheap and most users in an organization will have it.

• Excel lends itself to very easily creating financial models in a relatively short time span which normally could not be done in a more structured environment. Less reliance on IT means more flexibility and less time.

• The user experience from creating datasets to graphs is very intuitive and rich.


2.2. Risks
• Any VBA code or macros are specific to the workbook and are not held in a central repository to be reused. This raises concerns about version and change control over the code.

• If the required data set is sourced from any application/database within the IT estate, no way exists to document data lineage or do impact assessments when designing changes to these sub systems.

• Although Excel skills are widespread, most users will develop the model based on easy to develop rather than efficiency of code. No standard way of developing models makes it difficult to support/hand over.

• Any links to external data are also held in the workbook or specific to the PC. Often they contain T-SQL specific to the use-case and this is also not subject to version or change control.

• Excel has restrictions on the amount of data that it can consume. (1 million rows for Excel 2010, 2007 and 65,000 rows for Excel 2003)

• Security to the underlying database is not controlled via a centrally maintained application account but rather based on individual users. This requires users to be added and maintained at Database level.

• Access to spreadsheet models in not restricted via permissions or passwords as they are mostly stored in shared drives accessible to large groups of people.

• Users often save versions of the spreadsheet at various points during a process cycle and between process cycles for archiving. This leads to an explosion of redundant data and cuts of the code being held leading to increased storage costs. This can also lead to confusion as to what is the most recent and correct version. These issues are compounded by users sharing files by e-mailing them to each other.

• The models can become complex in nature and often process large volumes of data. These models can take a long time to run on a PC and in some cases can crash.

• Distributed development and locations lead to key person dependencies.


3. Risk Mitigation

In order to mitigate the risks outlined above one could put the following processes in place. Refer to Annexure A for the risk mitigation matrix.

  
• Set up standards on structure and development of Excel models.

• Ensure all models are developed against these standards, peer reviewed, tested, documented and appropriately transitioned into production.

• Store the models centrally in a controlled and identifiable location.

• Convert macros and VBA into add-ins and store them centrally.

• Implement security around the models for both access and changes.

• Implement appropriate change control and monitoring of the models to ensure that they are not changed without authorization.

In addition to the new or changed processes, technology can be used to mitigate some of the risks. Based on the risk matrix in Annexure B, a combination of Excel Services 2010 with PowerPivot and Prodiance, in theory could mitigate all the identified risks.
Below is a discussion around these technologies and how they could mitigate the risks.


4. Excel Services

Excel Services is the Excel engine delivered through Sharepoint and has the following key advantages:
• As it is delivered via Sharepoint on a server you can leverage the availability of larger memory, many more processors and 64 bit technology. This should reduce the time it takes to run larger and complex data models.

• Security is controlled via standard Sharepoint functionality.

• Centrally used data connections can be used thus eliminating the need for users to be granted direct access to the data.

• When published to Sharepoint one can only make certain parts of the model visible to the users thus hiding any business logic and underlying assumptions, enhancing security even further.

• Automatic scheduling of data updates can be implemented thus making refreshed reports instantly available to the users.

• As workbooks are versioned you can always restore to a previous one should you it be required.
 
Excel services does have some disadvantages listed below:
• Not everything that is native to the desktop version will work in Excel services such as add-ins and certain controls. Also some functionality becomes limited such as Pivot Tables and screen split and zoom functions.

• In order to modify the models you still need to pull them into the desktop version first and then republish them.

• Sharepoint is not a particularly easy technology to deploy and maintain and in order for Excel Services to be installed you have to have the Enterprise version.

5. PowerPivot

PowerPivot is an Excel add-in which has been developed by the Microsoft SQL Server Analysis Services Team and uses in-memory column compression technology branded as Vertipaq. This technology is very similar to that used by such market leaders as Qlikview. PowerPivot allows you to bring in large data sets into Excel and even join these sets to each other and then use the resulting data to report off. It is very efficient at consuming large amounts of data and when used in x64 bit mode with large amounts of memory is quite astonishing in its response.
 
PowerPivot also extends to Sharepoint and when combined with Excel Services in Sharepoint the uses and power suddenly start to become apparent.

6. Prodiance

In one of my blogs a while back I mentioned 4 companies that provide Excel Spreadsheet Control Software. We ultimately decided to back Prodiance as our choice, a decision that has now been ratified by Microsoft.
 
Up until 7th June 2011, Prodiance was an independent company specializing in risk and control software for Excel. It has now been purchased by Microsoft and is a wholly owned subsidiary. With Microsoft now purchasing the company more integration with the Office suite and Sharepoint is anticipated.
Advantages of implementing Prodiance over your spreadsheets are as follows (as per Prodiance pdf):
 
• Electronic sign-off and optional eSignatures

• Email notification of significant or unauthorized changes (e.g. exceptions, policy violations)

• Extensive cell-by-cell, file level and workflow audit trails

• Side-by-side comparison of changes between versions

• Management reports, dashboards and drill-down into detailed reports

• Automated document versioning

• Check-in/check-out (optional)

• Web based access to all historical versions

• Association of parent/child versions and unified view across all audit trails and reports

• Robust document security model integrated with Active Directory/LDAP users and groups

• Permissions to grant appropriate folder and document access levels (e.g. view, add, update, delete, approve, etc.).

• Microsoft Information Rights Management (IRM) encryption for spreadsheets containing sensitive information

• Workbook, worksheet and cell level protection

• Optional lock-down for cell level input control with data validation

• Excel Services for displaying and publishing read-only versions of critical spreadsheets and BI reports

• Extensive cell-by-cell, file level, system level and process level audit trails

• Tracking of changes to key inputs, outputs, spreadsheet data, formulas, macros and queries

• Tracking of changes to queries and data connections to external data sources

• Auditing support for spreadsheets and Access databases

• Optional lock-down for cell level input control with data validation

• Interactive link/dependency diagrams

• Auditing of specific input ranges (including named ranges)

• Automated email alerts upon changes to input ranges

• Color scheme tool to highlight (used and unused) input cells

• Spreadsheet validation and testing via automated cell, formula and range diagnostics

• Proactive identification of spreadsheet development and structural problems

• Comprehensive document and records management

• Enterprise class workflow management

7. Conclusion

In order to achieve the controlled use of Excel within an organization you will need to go through a change in technology, process and culture. I hope this blog has provided some readers with a spark for debate.

Appendix A
 
 
 
Appendix B
 
 
 

  

  

  

Saturday, 29 August 2009

A Novel Way of Dealing with Excel Models

Excel is both heaven and hell in the business world. It's a great calculator, modeler, reporter, you name it. With this incredible flexibility comes a trade off, your data is stored in multiple locations throughout the company in an unstructured and often unsecured manner. This data is not visible to the Enterprise and in most cases is not integrated into your reporting stack. Loads of capturing and data copying results in huge inefficiencies and in many cases errors.

My current employer embarked on a project 18 months ago to "black box" a number of these high profile models. Without going into the gory details let's just say that it did not work. I joined them in April of this year when my family and I made the move to London and was tasked with firstly evaluating the models to give an opinion on whether they were fit for purpose and could they be supported into the future. Sadly they could not and it was my unfortunate responsibility to stop the bleeding and bring the project to a conclusion. That was the easy part, the hard part is what do you do now as the original issues around the Excel Models were still relevant.

The obvious choice would be to take a look at the short listed products again that were selected at great expense and find the next suitable candidate. I was not comfortable with this approach and we did not want to try and rebuild them in PerformancePoint as all development for the product has been discontinued by Microsoft. (Don't even get me started on this subject, I have bitten my tongue since they shook the market with this announcement in January 2009).

I have come up with a rather "out of the box" approach which my team is in the process of designing at the moment. We needed to achieve a few objectives from the project being:
  • The models should be flexible
  • They should have access control
  • Changes should follow appropriate change control
  • They should be auditable
  • The data should be stored centrally and available to other business systems
Package these requirements up with the new Solvency II requirements and SOX etc, and you have quite a tall order.

Off course the most flexible model is still Excel. So what if you could keep Excel and also apply security, access control, change control, auditing etc. This led me into a journey of trying to find software that would give us this functionality. And guess what, it does exist, albeit a software concept in its infancy. Gartner have even released a paper on it (March 11, 2008) under the heading "MarketScope for Spreadsheet Control Products, 2008". It was quite comforting to find out that we had made contact with all the major players prior to this report and that our findings were in line with Gartners. Here are the major players in this space:
They all have a presence in London. Cimcon and Prodiance are US based suppliers whereas ClusterSeven and Finsbury are UK based.

Now they all do pretty much the same thing but in slightly different ways. Essentially each product has 3 major functionality groups:
  • Discovery and Risk Assessment - Basically trolling through all your locations looking for spreadsheets, categorising them and creating a baseline.
  • Monitoring - Keeping track of your spreadsheets and keeping a full audit trail of all changes and versions.
  • Development Tools - Some are add-ins, others applications that assist in developing your spreadsheets to minimise risk, errors etc.
So in a nutshell you can see who did what to which spreadsheet over time and set up alerts and workflows etc. You will be able to see broken links between spreadsheets and a whole host of interesting things that are out of the scope of this paper. We are in the process of getting RFP's from the suppliers and although I have some opinions will reserve them until a later date.

This piece of software should take care of most of the issues surrounding the Excel Models. Now for centrally storing the data and making it available to the rest of the business. My team is in the process of requirements gathering for a framework that will allow us to upload many different data sets from these models into a SQL Server database and thus make it available to our reporting layer. This may sound quite simple but to build a framework that is scalable. supportable and flexible is no easy task.

Watch this space!

- Paul Steynberg