General Mathematics
Mathematical Investigation
Modelling with Matrices
Using data from a sporting context to rank teams
Dominance matrices can be useful in round robin sporting competitions. The dominance model can be used to make predictions based on current season performances about which team might win a competition. You will choose a sport or competition played in a ‘round robin’ format where results can be easily obtained for a completed season so the final ranking of the teams is known.
(Note: the word ‘round’ can be used in many senses in sports, however in this task the term ‘round’ is taken to mean that each team has played every other team once.)
The investigation report should be a maximum of 12 single-sided A4 pages if written, or the equivalent in multimodal form.
Report Format
The report may take a variety of forms, but would usually include the following:
A bibliography and appendices, as appropriate, may be used.
The format of an investigation report may be written or multimodal.
Performance Standards for Stage 2 General Mathematics
- | Concepts and Techniques | Reasoning and Communication |
A | Comprehensive knowledge and understanding of concepts and relationships. Highly effective selection and application of mathematical techniques and algorithms to find efficient and accurate solutions to routine and complex problems in a variety of contexts. Successful development and application of mathematical models to find concise and accurate solutions. Appropriate and effective use of electronic technology to find accurate solutions to routine and complex problems. | Comprehensive interpretation of mathematical results in the context of the problem. Drawing logical conclusions from mathematical results, with a comprehensive understanding of their reasonableness and limitations. Proficient and accurate use of appropriate mathematical notation, representations, and terminology. Highly effective communication of mathematical ideas and reasoning to develop logical and concise arguments. Formation and testing of appropriate predictions, using sound mathematical evidence. |
B | Some depth of knowledge and understanding of concepts and relationships. Mostly effective selection and application of mathematical techniques and algorithms to find mostly accurate solutions to routine and some complex problems in a variety of contexts. Attempted development and successful application of mathematical models to find mostly accurate solutions. Mostly appropriate and effective use of electronic technology to find mostly accurate solutions to routine and some complex problems. | Mostly appropriate interpretation of mathematical results in the context of the problem. Drawing mostly logical conclusions from mathematical results, with some depth of understanding of their reasonableness and limitations. Mostly accurate use of appropriate mathematical notation, representations, and terminology. Mostly effective communication of mathematical ideas and reasoning to develop mostly logical arguments. Formation and testing of mostly appropriate predictions, using some mathematical evidence. |
C | Generally competent knowledge and understanding of concepts and relationships. Generally effective selection and application of mathematical techniques and algorithms to find mostly accurate solutions to routine problems in different contexts. Application of mathematical models to find generally accurate solutions. Generally appropriate and effective use of electronic technology to find mostly accurate solutions to routine problems. | Generally appropriate interpretation of mathematical results in the context of the problem. Drawing some logical conclusions from mathematical results, with some understanding of their reasonableness and limitations. Generally appropriate use of mathematical notation, representations, and terminology, with reasonable accuracy. Generally effective communication of mathematical ideas and reasoning to develop some logical arguments. Formation of an appropriate prediction and some attempt to test it using mathematical evidence. |
D | Basic knowledge and some understanding of concepts and relationships. Some selection and application of mathematical techniques and algorithms to find some accurate solutions to routine problems in context. Some application of mathematical models to find some accurate or partially accurate solutions. Some appropriate use of electronic technology to find some accurate solutions to routine problems. | Some interpretation of mathematical results. Drawing some conclusions from mathematical results, with some awareness of their reasonableness. Some appropriate use of mathematical notation, representations, and terminology, with some accuracy. Some communication of mathematical ideas, with attempted reasoning and/or arguments. Attempted formation of a prediction with limited attempt to test it using mathematical evidence. |
E | Limited knowledge or understanding of concepts and relationships. Attempted selection and limited application of mathematical techniques or algorithms, with limited accuracy in solving routine problems. Attempted application of mathematical models, with limited accuracy. Attempted use of electronic technology, with limited accuracy in solving routine problems. | Limited interpretation of mathematical results. Limited understanding of the meaning of mathematical results, their reasonableness or limitations. Limited use of appropriate mathematical notation, representations, or terminology, with limited accuracy. Attempted communication of mathematical ideas, with limited reasoning. Limited attempt to form or test a prediction. |
Modelling with Matrices
Using data from a sporting context to rank teams
Introduction
This report studies how dominance matrices can be used in sporting competitions. The dominance model can be used to make predictions, based on current season performances, about which team might win a competition. The report applies dominance matrices to a sport in which the teams' results are known up to a certain round, predicts the results of the round still to be played, and hence predicts the likely final standings. It also considers circumstances that can arise while predicting results, such as a tie, and how they affect the mathematical calculations.
It also explains why and how the method works by interpreting the observed changes. The limitations of the first-order matrix are noted and analysed further with the help of higher-order matrices. Finally, the report explains how refining the dominance matrix approach could enhance the analysis and hence predict with higher accuracy.
Outline of the Problem:
Predict the final standings of the teams in a competition organised in the round-robin format.
Use the dominance model to predict the results of at least 3 games of the event.
Methodology:
Investigate ways of refining the dominance model which might improve the predictions made, considering:
1. The sport for which we will use dominance matrices to make predictions is the Big Bash League (BBL). The Big Bash League is an Australian Twenty20 cricket league (Big Bash, 2018). The league features eight city-based franchises, or cricket teams, which compete against each other. The competition was previously called the KFC Twenty20 Big Bash and featured six state teams, which were later replaced by the 8 city teams.
It was established in 2011 by Cricket Australia, and its first tournament took place that year. The Big Bash League is an annual league, played in the months of December and January.
2. Dataset
We use the data from the 2011 BBL tournament to make predictions using the dominance matrix (Kaggle, 2011).
As per the past statistics, out of the eight teams, five teams have won the title at least once in the tournament.
Description of dataset :
MatchDateSK - Date of Match played - (YYYYMMDD) Numeric
Team 1 - Team Name - String
Team 2 - Team Name - String
Winner - Winner of the match - String
Margin - Winning Margin - (wickets/runs) String
The winning margin is given in runs when the team batting first wins, and in wickets when the team batting second wins.
The report uses only MatchID, Team 1, Team 2, and Winner from season 2011.
In the first iteration, the report uses only the 7 round-robin matches played by each team in season 2011 (28 matches in total). To improve the model further, more games and their results could be added.
The detailed dataset of the round-robin games played by each team:
MatchDateSK | Team 1 | Team 2 | Winner | Margin |
20111216 | Syd Sixers | Heat | Syd Sixers | 7 wickets |
20111217 | Melb Stars | Syd Thunder | Syd Thunder | 6 wickets |
20111218 | Scorchers | Hurricanes | Hurricanes | 31 runs |
20111218 | Strikers | Melb Reneg | Strikers | 67 runs |
20111220 | Heat | Melb Stars | Melb Stars | 8 runs |
20111221 | Hurricanes | Syd Sixers | Hurricanes | 42 runs |
20111222 | Melb Reneg | Scorchers | Scorchers | 8 wickets |
20111223 | Syd Thunder | Strikers | Syd Thunder | 6 wickets |
20111227 | Syd Sixers | Melb Stars | Syd Sixers | 2 runs |
20111228 | Strikers | Hurricanes | Hurricanes | 14 runs |
20111229 | Scorchers | Heat | Scorchers | 10 runs |
20111230 | Syd Thunder | Melb Reneg | Melb Reneg | 6 runs |
20120101 | Hurricanes | Syd Thunder | Hurricanes | 5 wickets |
20120102 | Melb Reneg | Syd Sixers | Melb Reneg | 8 wickets |
20120103 | Heat | Strikers | Strikers | 31 runs |
20120104 | Melb Stars | Scorchers | Scorchers | 8 runs |
20120106 | Heat | Hurricanes | Heat | 3 runs |
20120107 | Melb Stars | Melb Reneg | Melb Stars | 11 runs |
20120108 | Syd Thunder | Syd Sixers | Syd Sixers | 17 runs |
20120108 | Scorchers | Strikers | Scorchers | 42 runs |
20120109 | Hurricanes | Melb Stars | Melb Stars | 19 runs |
20120110 | Strikers | Syd Sixers | Syd Sixers | 64 runs |
20120111 | Syd Thunder | Scorchers | Scorchers | 9 wickets |
20120112 | Melb Reneg | Heat | Heat | 12 runs |
20120117 | Heat | Syd Thunder | Heat | 91 runs |
20120118 | Syd Sixers | Scorchers | Syd Sixers | 1 run |
20120118 | Hurricanes | Melb Reneg | Hurricanes | 7 wickets |
20120119 | Melb Stars | Strikers | Melb Stars | 6 wickets |
Let us assume that the different codes given to the teams are as follows:-
Heat – Team 1
Hurricanes – Team 2
Melb Reneg – Team 3
Melb Stars - Team 4
Strikers – Team 5
Syd Thunder – Team 6
Syd Sixers - Team 7
Scorchers – Team 8
There are 8 teams in the league, and each team plays each of the other 7 teams once.
Hence, the total number of matches is 28, i.e. (8 teams × 7 matches per team) / 2 teams per match.
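The match count above can be checked with a short sketch. This is only an illustration of the counting argument; the variable names are hypothetical and the team codes 1 to 8 are the ones assigned above.

```python
from itertools import combinations

teams = range(1, 9)                      # team codes 1..8
fixtures = list(combinations(teams, 2))  # every unordered pair of teams meets once
matches_per_team = len(teams) - 1        # each team plays all the other teams

print(len(fixtures))     # 28
print(matches_per_team)  # 7
```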
MatchDateSK | Team 1 | Team 2 | Winner | Margin |
20111216 | 7 | 1 | 7 | 7 wickets |
20111217 | 4 | 6 | 6 | 6 wickets |
20111218 | 8 | 2 | 2 | 31 runs |
20111218 | 5 | 3 | 5 | 67 runs |
20111220 | 1 | 4 | 4 | 8 runs |
20111221 | 2 | 7 | 2 | 42 runs |
20111222 | 3 | 8 | 8 | 8 wickets |
20111223 | 6 | 5 | 6 | 6 wickets |
20111227 | 7 | 4 | 7 | 2 runs |
20111228 | 5 | 2 | 2 | 14 runs |
20111229 | 8 | 1 | 8 | 10 runs |
20111230 | 6 | 3 | 3 | 6 runs |
20120101 | 2 | 6 | 2 | 5 wickets |
20120102 | 3 | 7 | 3 | 8 wickets |
20120103 | 1 | 5 | 5 | 31 runs |
20120104 | 4 | 8 | 8 | 8 runs |
20120106 | 1 | 2 | 1 | 3 runs |
20120107 | 4 | 3 | 4 | 11 runs |
20120108 | 6 | 7 | 7 | 17 runs |
20120108 | 8 | 5 | 8 | 42 runs |
20120109 | 2 | 4 | 4 | 19 runs |
20120110 | 5 | 7 | 7 | 64 runs |
20120111 | 6 | 8 | 8 | 9 wickets |
20120112 | 3 | 1 | 1 | 12 runs |
20120117 | 1 | 6 | 1 | 91 runs - Test data |
20120118 | 7 | 8 | 7 | 1 run - Test data |
20120118 | 2 | 3 | 2 | 7 wickets - Test data |
20120119 | 4 | 5 | 4 | 6 wickets - Test data |
The top 4 teams were Teams 2, 4, 7 and 8. Hence, these teams were the semi-finalists in the tournament.
Dominance Matrix of order 1, D:
The dominance matrix is prepared by counting the match results and entering them into the matrix.
In the matrix below, both the row labels and the column titles denote the teams by their codes.
Each win is represented as 1 and each defeat as 0: the entry in row i and column j is 1 if Team i defeated Team j.
The total sum of each row indicates the number of wins for that team, which can be used to rank the teams.
For example,
i(1,2) = 1 means Team 1 defeated Team 2,
or
i(8,1) = 1 means Team 8 defeated Team 1.
The original dominance matrix of the results of the Big Bash season 2011 is given below:
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total sum |
1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 3 |
2 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 5 |
3 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 2 |
4 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 4 |
5 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 2 |
6 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 2 |
7 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 5 |
8 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 5 |
As we can observe from the above matrix, three teams share first position with equal points (5 wins each): Team 2, Team 7 and Team 8.
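The construction of D from the results can be sketched in Python. This is a minimal illustration, not necessarily how the tables in this report were produced; the names `RESULTS` and `dominance_matrix` are hypothetical, and the (winner, loser) pairs are transcribed from the coded results table above.

```python
# (winner, loser) pairs in team codes, in the order of the coded results table
RESULTS = [
    (7, 1), (6, 4), (2, 8), (5, 3), (4, 1), (2, 7), (8, 3), (6, 5),
    (7, 4), (2, 5), (8, 1), (3, 6), (2, 6), (3, 7), (5, 1), (8, 4),
    (1, 2), (4, 3), (7, 6), (8, 5), (4, 2), (7, 5), (8, 6), (1, 3),
    (1, 6), (7, 8), (2, 3), (4, 5),
]

def dominance_matrix(results, n=8):
    """Return an n x n matrix with 1 in row w, column l when team w beat team l."""
    d = [[0] * n for _ in range(n)]
    for winner, loser in results:
        d[winner - 1][loser - 1] = 1
    return d

D = dominance_matrix(RESULTS)
wins = [sum(row) for row in D]  # row totals = number of wins per team
leaders = [team for team, w in enumerate(wins, start=1) if w == max(wins)]

print(wins)     # [3, 5, 2, 4, 2, 2, 5, 5]
print(leaders)  # [2, 7, 8]
```

The row totals add up to 28, one win per match, which is a quick check that no result was missed.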
Problem Outline: How to assign distinct positions to Team 2, Team 7 and Team 8 within the top 3.
Method: We find the square and the cube of the original matrix D, one step at a time, to break the tie between the three teams.
Step 1: Dominance matrix of order 2, D²
We square the original matrix D, which can be written as:
D × D = D²
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total sums |
1 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 1 | 9 |
2 | 3 | 0 | 2 | 3 | 3 | 3 | 1 | 1 | 16 |
3 | 1 | 0 | 0 | 2 | 2 | 1 | 0 | 1 | 7 |
4 | 1 | 1 | 3 | 0 | 1 | 3 | 2 | 1 | 12 |
5 | 0 | 1 | 1 | 0 | 0 | 2 | 1 | 0 | 5 |
6 | 2 | 1 | 2 | 0 | 1 | 0 | 0 | 0 | 6 |
7 | 3 | 2 | 4 | 2 | 3 | 2 | 0 | 0 | 16 |
8 | 2 | 2 | 3 | 1 | 2 | 2 | 1 | 0 | 13 |
Here, i(1,5) = 2 means that Team 1 has defeated 2 teams which in turn have defeated Team 5.
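The squaring step and the meaning of a second-order entry can be illustrated with a small sketch. It uses plain Python lists rather than a matrix library; the helper `matmul` and the variable names are hypothetical, and D is copied from the first-order table.

```python
# First-order dominance matrix D (row i, column j is 1 when Team i+1 beat Team j+1)
D = [
    [0, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 1, 0, 0],
]

def matmul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

D2 = matmul(D, D)
second_order_totals = [sum(row) for row in D2]

# Teams that Team 1 beat and that in turn beat Team 5
via = [k + 1 for k in range(8) if D[0][k] and D[k][4]]

print(D2[0][4])             # 2
print(via)                  # [2, 6]
print(second_order_totals)  # [9, 16, 7, 12, 5, 6, 16, 13]
```

Each entry of D² counts the two-step chains of wins between a pair of teams, which is why the entry for Team 1 over Team 5 equals the number of intermediate teams found.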
Step 2: Dominance matrix of order 3, D³
For the last step, we find the cube of the matrix D, which can be written as:
D² × D = D³
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total sum |
1 | 6 | 1 | 4 | 5 | 6 | 4 | 1 | 2 | 29 |
2 | 8 | 6 | 10 | 5 | 8 | 7 | 2 | 1 | 47 |
3 | 5 | 3 | 6 | 2 | 4 | 2 | 0 | 0 | 22 |
4 | 4 | 1 | 4 | 6 | 7 | 8 | 4 | 3 | 37 |
5 | 1 | 0 | 1 | 3 | 4 | 3 | 2 | 2 | 16 |
6 | 1 | 2 | 4 | 0 | 1 | 5 | 3 | 1 | 17 |
7 | 5 | 5 | 10 | 2 | 6 | 9 | 6 | 2 | 45 |
8 | 4 | 3 | 7 | 3 | 6 | 8 | 5 | 3 | 39 |
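One way to read the tie-break is to compare wins first, then second-order totals, then third-order totals. The sketch below assumes that ordering; the helper names are hypothetical and D is copied from the first-order table.

```python
D = [
    [0, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 1, 0, 0],
]

def matmul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def row_totals(m):
    return [sum(row) for row in m]

D2 = matmul(D, D)
D3 = matmul(D2, D)
t1, t2, t3 = row_totals(D), row_totals(D2), row_totals(D3)

# Assumed tie-break: wins, then second-order total, then third-order total
ranking = sorted(range(1, 9),
                 key=lambda t: (t1[t - 1], t2[t - 1], t3[t - 1]),
                 reverse=True)

print(t3)       # [29, 47, 22, 37, 16, 17, 45, 39]
print(ranking)  # [2, 7, 8, 4, 1, 3, 6, 5]
```

Under this reading, Teams 2 and 7 are still level after D² (16 each) and are only separated by D³ (47 against 45), while Team 8 drops to third on its second-order total of 13.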
Now, let us assume that the last round (4 matches) has not yet happened. By removing the results of the last 4 matches from the matrix, we can predict those results using the dominance matrix itself.
Step 1: Dominance matrix of order 1, D':
Here is the updated dominance matrix:
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total Sum |
1 | 0 | 1 | 1 | 0 | 0 | - | 0 | 0 | 2 |
2 | 0 | 0 | - | 0 | 1 | 1 | 1 | 1 | 4 |
3 | 0 | - | 0 | 0 | 0 | 1 | 1 | 0 | 2 |
4 | 1 | 1 | 1 | 0 | - | 0 | 0 | 0 | 3 |
5 | 1 | 0 | 1 | - | 0 | 0 | 0 | 0 | 2 |
6 | - | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 2 |
7 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | - | 4 |
8 | 1 | 0 | 1 | 1 | 1 | 1 | - | 0 | 5 |
“-”: these entries represent the results of the last round (4 matches), which were removed from the matrix.
For matrix calculations, we need to replace “-” by “0”, otherwise the matrix products cannot be computed. Below is the updated dominance matrix, D'.
4 & 5. Problem outline: Use the dominance matrix to rank the teams on the results so far and make predictions about the outcomes of the three games yet to be played. Also, discuss second and third order influences and their significance. Choose a supremacy model to use with your data and compare its predictions to those made for the three games yet to be played in part 4.
Step 1:
Here, “0” represents either a loss or a match that has not been played.
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total Sum |
1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 2 |
2 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 4 |
3 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 2 |
4 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 3 |
5 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 2 |
6 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 2 |
7 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 4 |
8 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 5 |
The dominance matrix D' shows that the top 4 teams are:
Rank 1 - Team 8
Rank 2 - Team 2, 7
Rank 4 - Team 4
All other teams have equal points (2 wins each), so to rank them we will generate a second-order matrix.
The dominance matrix D' will also help us to predict the results of the last round in the later steps.
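Removing the last round and grouping the teams by wins can be sketched as follows. This is a minimal illustration with hypothetical names; D is the full-season matrix and `LAST_ROUND` lists the (winner, loser) pairs of the four matches being held back.

```python
# Full-season first-order dominance matrix D
D = [
    [0, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 1, 0, 0],
]
LAST_ROUND = [(1, 6), (7, 8), (2, 3), (4, 5)]  # (winner, loser) of the held-back matches

D_prime = [row[:] for row in D]  # copy so that D itself is unchanged
for winner, loser in LAST_ROUND:
    D_prime[winner - 1][loser - 1] = 0  # treat the match as not yet played

wins = [sum(row) for row in D_prime]

# Group team codes by number of wins to expose the ties
by_wins = {}
for team, w in enumerate(wins, start=1):
    by_wins.setdefault(w, []).append(team)

for w in sorted(by_wins, reverse=True):
    print(w, by_wins[w])
# 5 [8]
# 4 [2, 7]
# 3 [4]
# 2 [1, 3, 5, 6]
```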
As per the match schedule given in the dataset above, the last round of 4 matches will be held between the following teams:
Team 1 v/s Team 6
Team 7 v/s Team 8
Team 2 v/s Team 3
Team 4 v/s Team 5
Step 2: Dominance matrix of order 2, D'²
From the match schedule given above, we cannot yet predict the results of all the matches, because up to the second-last round (as per matrix D') some of the teams involved are level on wins; for example, Team 1 and Team 6 have 2 wins each.
Hence, to determine which teams are ahead and the ranking after 6 rounds, we need to prepare another matrix.
Squaring the dominance matrix D', we get
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total Sum |
1 | 0 | 0 | 0 | 0 | 1 | 2 | 2 | 1 | 6 |
2 | 3 | 0 | 2 | 3 | 3 | 2 | 0 | 0 | 13 |
3 | 1 | 0 | 0 | 2 | 2 | 1 | 0 | 0 | 6 |
4 | 0 | 1 | 1 | 0 | 1 | 2 | 2 | 1 | 8 |
5 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 4 |
6 | 2 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 5 |
7 | 2 | 2 | 3 | 1 | 1 | 0 | 0 | 0 | 9 |
8 | 2 | 2 | 3 | 1 | 1 | 1 | 1 | 0 | 11 |
Here, i(1,6) = 2 means that Team 1 has defeated 2 teams (Teams 2 and 3) which in turn defeated Team 6.
Ranking after Round 6 :
Prediction :
Hence, from the above matrix D'², we can observe that:
Between team 1 v/s team 6 = Tie or team 1 (uncertain)
Between team 7 v/s team 8 = Team 8
Between team 2 v/s team 3 = Team 2
Between team 4 v/s team 5 = Team 4
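The second-order comparison behind these predictions can be sketched as below. It assumes that the team with the larger D'² row total is favoured, which is one plausible reading of the method; the names `FIXTURES`, `matmul` and `picks` are hypothetical.

```python
# Dominance matrix D' after six rounds (last-round results set to 0)
D_prime = [
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 0, 0],
]
FIXTURES = [(1, 6), (7, 8), (2, 3), (4, 5)]  # last-round matches

def matmul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

totals2 = [sum(row) for row in matmul(D_prime, D_prime)]

picks = []
for a, b in FIXTURES:
    ta, tb = totals2[a - 1], totals2[b - 1]
    pick = a if ta > tb else b if tb > ta else None  # None marks a tie
    picks.append(pick)
    print(f"Team {a} ({ta}) v Team {b} ({tb}) -> {pick}")
# Team 1 (6) v Team 6 (5) -> 1
# Team 7 (9) v Team 8 (11) -> 8
# Team 2 (13) v Team 3 (6) -> 2
# Team 4 (8) v Team 5 (4) -> 4
```

The gap between Team 1 and Team 6 is only one second-order win (6 against 5) on top of equal first-order wins, which is consistent with treating that match as uncertain at this stage.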
To separate Team 1 and Team 3 in the ranking and to predict the result of the match between Team 1 and Team 6, we need to calculate the third-order dominance matrix.
Step 3: Let's calculate D'³ to predict the result of 1 v/s 6 and the ranking:
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total Sum |
1 | 4 | 0 | 2 | 5 | 5 | 3 | 0 | 0 | 19 |
2 | 6 | 6 | 9 | 2 | 2 | 2 | 2 | 0 | 29 |
3 | 4 | 3 | 5 | 1 | 1 | 0 | 0 | 0 | 14 |
4 | 4 | 0 | 2 | 5 | 6 | 5 | 2 | 1 | 25 |
5 | 1 | 0 | 0 | 2 | 3 | 3 | 2 | 1 | 12 |
6 | 0 | 2 | 2 | 0 | 1 | 3 | 3 | 1 | 12 |
7 | 2 | 3 | 4 | 0 | 2 | 5 | 5 | 2 | 23 |
8 | 3 | 3 | 4 | 2 | 4 | 6 | 5 | 2 | 29 |
As per the above matrix, the predicted winner between Team 1 and Team 6 is Team 1 (a total of 19 against 12). We can also observe the ranking of the teams.
Ranking after Round 6 :
Hence, the final prediction of winners according to dominance matrix D` is as follows -
Team 1 v/s Team 6 - Team 1
Team 2 v/s Team 3 - Team 2
Team 4 v/s Team 5 - Team 4
Team 7 v/s Team 8 – Team 8
The actual winners, as per matrix D, were:
Team 1 v/s Team 6 - Team 1
Team 2 v/s Team 3 - Team 2
Team 4 v/s Team 5 - Team 4
Team 7 v/s Team 8 – Team 7
So the dominance matrix in this case predicted 3 of the 4 results correctly, i.e. with 75% accuracy.
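The accuracy figure can be reproduced with a short sketch that compares third-order totals against the actual last-round winners. The rule "larger D'³ row total wins" is an assumption made for illustration, and the names `ACTUAL`, `predicted` and `matmul` are hypothetical.

```python
# Dominance matrix D' after six rounds (last-round results set to 0)
D_prime = [
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 0, 0],
]
# Actual winner of each last-round fixture, from the results table
ACTUAL = {(1, 6): 1, (7, 8): 7, (2, 3): 2, (4, 5): 4}

def matmul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

D3 = matmul(matmul(D_prime, D_prime), D_prime)
totals3 = [sum(row) for row in D3]

# Pick the team with the larger third-order total (none of these fixtures is tied)
predicted = {(a, b): (a if totals3[a - 1] > totals3[b - 1] else b)
             for a, b in ACTUAL}

correct = sum(predicted[f] == ACTUAL[f] for f in ACTUAL)
accuracy = correct / len(ACTUAL)

print(totals3)            # [19, 29, 14, 25, 12, 12, 23, 29]
print(predicted)          # {(1, 6): 1, (7, 8): 8, (2, 3): 2, (4, 5): 4}
print(f"{accuracy:.0%}")  # 75%
```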
6. Problem Outline: Use the supremacy model to make a prediction of the final ladder placings of all the teams in your sample. Compare your result with the actual ranking at the end of the season and discuss the result.
Based on the predicted results of the final-round matches, here are the predicted final standings of the teams:
The actual standings after all the rounds were:
Accuracy of Dominance matrix - 87.5 %
7. Problem Outline : Investigate ways of refining your dominance model which might improve the predictions made. Considering:
•different supremacy models
•adding further game outcomes to the dominance matrix
•some way of incorporating winning margins
Ways to refine the model :
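As one possible refinement, a weighted supremacy score could combine first- and second-order dominance so that indirect wins count for less than direct wins. The sketch below is only an illustration: the weight `W2 = 0.5` is an assumption and is not taken from this report, and all names are hypothetical.

```python
# Dominance matrix D' after six rounds (last-round results set to 0)
D_prime = [
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 0, 0],
]
W2 = 0.5  # assumed weight for second-order dominance

def matmul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

totals1 = [sum(row) for row in D_prime]
totals2 = [sum(row) for row in matmul(D_prime, D_prime)]

# Supremacy score = direct wins + W2 * two-step wins
supremacy = [t1 + W2 * t2 for t1, t2 in zip(totals1, totals2)]

print(supremacy)  # [5.0, 10.5, 5.0, 7.0, 4.0, 4.5, 8.5, 10.5]
```

With this particular weight, Team 1 (5.0) is placed ahead of Team 6 (4.5), but Teams 2 and 8 remain level on 10.5, so a third-order term or a different weight would still be needed to separate them.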
Ways to incorporate winning margin :
8. Problem Outline: Using the results from above, summarise the findings. Comment on how accurately your models relate to the real situation. Discuss any limitations of the models, and the reasonableness of the solutions found.
Using the dominance matrix and its 2nd and 3rd order matrices, we found the ranking of the teams after round 6, predicted the results of round 7, and predicted the ranks of the teams after round 7.
Also, we generated the ranking for the tournament after round 7.
Our model predicted the results of round 7 with 75% accuracy and the ranks of the teams with 87.5% accuracy.
Tuning this model further carries an operational overhead, as incorporating each team's winning margin, the venue, the weather and player details is a large and complex task for this model.
Conclusion
We can conclude that, for this dataset, the dominance matrix was a reasonably reliable technique for making predictions, based on current season performances, about which team might win a competition: it predicted 3 of the 4 final-round results correctly.