Objectives
- Transform data into information.
- Use statistical techniques and tools to reduce the number of metrics that need to be monitored and analyzed.
- Apply regression analysis to determine the scalability of web applications and services.
Who Should Attend?
Computer system administrators, mainframe system operators, network system administrators, performance engineers, test engineers, IT consultants, data center managers, DevOps engineers, IT technical managers, and software development engineers. The course assumes no prior experience with performance analysis methods, but a working knowledge of computer systems and high school algebra is helpful.
Outline
- How to Detect Bad Data
- All data is wrong by definition
- Broken performance tools
- The power of good statistical models
- Introduction to R
- Why R is de RigueuR on Wall St and elsewhere
- My special 911.r script
- R commands
- R language
- R graphics
- Installing R
- Measurement Errors and Analysis
- Measurement is a process not a number
- Confidence intervals and sigma levels
- Confidence bands and QQ plots
- How to express errors
- Review of Elementary Statistics
- Descriptive statistics
- Measures of central tendency: mean, median and mode
- Meaning of the means: arithmetic, geometric, harmonic
- Measures of dispersion: stdev, variance, stderr, percentiles
- Summarizing data and its statistics
- Distributions and Histograms
- Review of Uniform, Normal, Poisson, Exponential distributions
- How to determine normal distributions
- How to determine exponential/Poisson distributions
- Weighted multi-class workloads
- Review of Benchmarking and Load Test Tools
- History of industry benchmarks SPEC and TPC
- Steady-state measurement period
- Comparing vendor benchmarks
- Scalability Analysis
- Load test data and QA analysis
- Universal scalability law
- Analyzing data for scalability zones
- Multivariate Linear and Nonlinear Regression
- ANOVA: Analysis of Variance
- Moving averages
- Web server scalability
- Web traffic profiles and time zones
- Data Mining Techniques for CaP
- Machine learning algorithms
- Support Vector Machines
- Supervised learning
- The svm() function in R (e1071 package)
- Detecting performance patterns and defining exceptions
- Wild Not Mild Data Distributions
- Power law data and distributions
- Case studies: SQL access patterns, web traffic, data recovery
- Data validation using QQ plots, log-linear plots and log-log plots
- Taming the Data Torrent
- Principal component analysis
- Reducing the number of monitored metrics
- Case studies: PerfViz, Apdex, Barry
- PDQ-R Queueing Modeling Tool
- The statistics of queues
- Case study: Modeling networked storage
- Case study: Multi-tier e-commerce data and PDQ analysis
- Review and Class Discussion
Certificates
A Certificate of Completion will be issued to those who attend and successfully complete the programme.
Schedule
08:30 – 10:15 First Session
10:15 – 10:30 Coffee Break
10:30 – 12:15 Second Session
12:15 – 12:30 Coffee Break
12:30 – 14:00 Third Session
14:00 – 15:00 Lunch
Training Methodology
This interactive training course uses the following training methods, shown as percentages of the total tuition hours:
- 30% Lectures, Concepts, Role Play
- 20% Workshops & Work Presentations, Techniques
- 20% Case Studies & Practical Exercises
- 10% Videos, Software & General Discussions
- 20% Application
- Pre and Post Test
Fees
The fee for the seminar, including instructional materials, documentation, lunch, coffee/tea breaks, and snacks, is: