
  • The highs, lows and future of ethical machine learning

    The world’s most valuable resource is no longer oil, but data – The Economist

    Big data is the new oil. Just as oil enabled our civilisation to progress, big data provides many opportunities for advancements and insights, primarily through machine learning and artificial intelligence. In the first of a two-part series, we explore the ups and downs of a booming industry and why ethics are increasingly centre stage.

    What is machine learning?

    Machine Learning typically refers to algorithms and statistical models that enable computers to draw inferences from patterns in data without having each step explicitly encoded by a human operator. Some of these algorithms may also self-adapt over time as new data is presented to the system. These systems may then be applied to tasks normally perceived to require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.

    Typically, machine learning requires large amounts of data to achieve accurate results. Developments in computing power, storage and the increased digitalisation of the world have created the conditions to facilitate this requirement and allow for the data-based revolution to take place.
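
    To make the idea concrete, here's a minimal sketch of a model learning a pattern from labelled examples rather than following hand-written rules. The data is synthetic and scikit-learn is an assumed dependency; this is illustration, not a production pipeline:

    ```python
    from sklearn.neighbors import KNeighborsClassifier

    # Toy illustration: nobody writes the rule that separates the two
    # classes -- the model infers it from the labelled examples alone.
    X = [[22], [25], [30], [55], [58], [60]]   # one input feature, e.g. age
    y = [0, 0, 0, 1, 1, 1]                     # labels we want to predict

    model = KNeighborsClassifier(n_neighbors=3).fit(X, y)
    print(model.predict([[28], [62]]))         # -> [0 1]
    ```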

    What are the benefits of machine learning?

    Machine learning delivers many opportunities …

    Insights
    It can help us find new relationships in data. For example:

    • Finding similarities in bacterial and viral DNA sequences can help determine the evolutionary relationships between different strains. For disease-causing pathogens, this is an important tool for tracking outbreaks and developing treatments.
    • In the same general field, scientists believe a recent development could transform computational biology. DeepMind’s AlphaFold, which uses deep learning, has vastly improved the ability of computers to accurately predict the 3D shape of proteins, comprehensively outperforming the prior state-of-the-art algorithms in a recent global competition. Knowing the shape of proteins is important in many research areas, including genetic diseases and drug discovery.

    Efficiency
    It can lead to greater efficiencies in processes. Natural Language Processing (NLP) was used to mine more than 200,000 research articles for findings that might be relevant to the treatment of COVID-19.
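
    As a stylised sketch of this kind of literature mining (three stub abstracts stand in for the 200,000 papers; everything below is invented for illustration, with standard scikit-learn text tools assumed):

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Rank toy 'abstracts' by relevance to a treatment-related query
    abstracts = [
        "Antiviral drug X reduced viral load in SARS-CoV-2 patients",
        "Survey of bat coronaviruses in South-East Asia",
        "Corticosteroid treatment outcomes for severe COVID-19 pneumonia",
    ]
    query = ["treatment options for COVID-19"]

    vec = TfidfVectorizer().fit(abstracts)
    scores = cosine_similarity(vec.transform(query), vec.transform(abstracts))[0]
    for i in scores.argsort()[::-1]:           # most relevant first
        print(round(scores[i], 2), abstracts[i])
    ```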

    Service
    It can improve some aspects of service delivery:

    • NLP tools like Google Translate are making communication easier across the language divide, opening numerous opportunities – personal, business and societal
    • Recommender algorithms like those used by streaming platforms can guide us to content that is of interest to us.
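
    The item-based idea behind such recommenders can be sketched in a few lines. The ratings matrix is invented and production systems are vastly more sophisticated; this is only the core mechanic:

    ```python
    import numpy as np

    # Rows are users, columns are titles; 0 means 'not yet watched'
    ratings = np.array([
        [5, 4, 0, 1],
        [4, 5, 1, 0],
        [1, 0, 5, 4],
        [0, 1, 4, 5],
    ], dtype=float)

    # Cosine similarity between titles (columns)
    norms = np.linalg.norm(ratings, axis=0)
    sim = ratings.T @ ratings / np.outer(norms, norms)

    # Score user 0's titles by similarity to what they have rated
    user = ratings[0]
    scores = sim @ user
    scores[user > 0] = -np.inf       # exclude titles already watched
    print("recommend title", int(np.argmax(scores)))   # -> 2, the unseen title
    ```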

    Decision-making
    It can combine a lot of data into a more digestible form to assist with decision-making:

    • Medical diagnoses are crucial to get right. Machine learning systems can help process large volumes of data, such as X-ray images, to suggest possible diagnoses, and can sometimes outperform human experts. For example, a 2020 study reported in The Lancet medical journal found that an ML algorithm showed better diagnostic performance in breast cancer detection compared with radiologists.
    • Organisations (government, NGO and corporate) can similarly use machine learning tools to inform their decision-making.

    The dark side of machine learning

    The data and machine learning revolution also has significant downsides.

    Misuse of data
    This is the use of data without permission. This is well exemplified by the Cambridge Analytica (CA) scandal where, among other things, CA harvested the personal data of up to 87 million Facebook users through an app called This is Your Digital Life. Users gave the app permission to acquire their data, which in turn gave the app access to their network of Facebook friends. Only about 270,000 users actually used the app, meaning the vast majority of people whose data was taken by CA had not given permission for their data to be used.

    Many journalists and experts consider Facebook is shaking the foundations of democracy

    However, the use of data is troublesome even when internet users ‘consent’ to share their data. How many of us actually read the small print outlining which permissions are granted? It’s likely that consent is frequently given without a true understanding of what exactly is being granted. And even if an individual denies permission, the CA case demonstrates that our social network contacts can unwittingly release our data without our explicit permission.

    Creation of echo chambers
    Platforms like Google and Facebook want us to keep coming back. To ensure this, they show us content they believe we are interested in, rather than a balanced view of whatever issue we are researching, be it climate change or politics, for example.

    The polarised political scene in the United States, for example, has highlighted the issue of ethical responsibility for social media giants, especially in considering the broader implications of showing people content that aligns only with a user’s opinions.

    Similarly, climate change denial and anti-vaccination campaigns owe no small part to recommender algorithms (such as those used by Google, YouTube and Facebook), sending people down a rabbit hole of anti-science rhetoric. If you want a good illustration of this, watch a YouTube video on something like the Flat Earth conspiracy theory and note the subsequent recommendations you receive.

    Externalities/unintended consequences
    In economics, an externality is a cost or benefit caused by a producer that is not financially incurred or received by that producer. In the world of machine learning, this could also be rephrased as unintended consequences, a kind of digital or cyber pollution.

    It’s unlikely Mark Zuckerberg set out to shake the foundations of democracy when he set up Facebook, and yet many journalists and experts consider Facebook does just that, both through the creation of echo chambers and facilitating the spread of misinformation. Even if corrected or removed afterwards, the misinformation lingers. As the Anglo-Irish satirist Jonathan Swift wrote in 1710, “Falsehood flies, and the truth comes limping after it.” Recent reporting by the New York Times suggests a conflict within Facebook between maximising user engagement and platform growth, and reducing the spread of false or incendiary information.

    In her book Weapons of Math Destruction, Cathy O’Neil argues that US university rankings introduced in the 1980s by a single news magazine created a feedback mechanism with self-reinforcing rankings – that is, lower ranked universities would be avoided by top students and academics, funding would reduce, and the ranking would fall even further.

    She also makes several other arguments, notably that the omission of college fees from the metrics used to assess universities is one contributor to the high cost of education at prestigious US universities. Dramatically altering the third-level education environment was not the objective of the news magazine, rather it was focused on ensuring its own survival and selling copies.

    So what can we do?

    There’s an obvious need for a societal response to these challenges. Regulations such as the EU’s General Data Protection Regulation (GDPR) are a small step in the right direction. A set of GDPR provisions is targeted towards AI, restricting automated decision-making (ADM) and profiling when there may be ‘legal’ or ‘similarly significant’ effects on individuals (for example, the right to vote, exercise contractual rights, or effects that influence an individual’s circumstances, behaviour or choices).

    The ePrivacy Directive and GDPR mandate that users must be able to deny access to cookies. However, it’s much easier for EU residents to click the Accept All Cookies button once than to go through the steps of rejecting all cookies and saving this setting each time they visit a website. And that’s assuming the website is available at all – some content is not viewable within Europe for those who deny cookies.

    Clearly more needs to be done. While the systemic nature of the risks of machine learning and big data demand a system-wide approach to its regulation, it’s also important that individual users and developers of machine learning tools think about their own planned use and ensure that it’s ethical.

    Experts suggest Just War Theory can be applied to AI to achieve greater fairness

    What is Just War Theory and what can it teach us?

    On the back of the growing concern over the use of machine learning and the broader field of artificial intelligence (AI), Professor Seth Lazar, of the School of Philosophy at the Australian National University, and project leader of the interdisciplinary research project Humanising Machine Intelligence, suggests using Just War Theory as a philosophical framework. Just War Theory is studied by theologians, ethicists, and policy and military leaders. It aims to set out conditions that must be met for a war to be considered just. There are two components:

    • Jus ad bellum: the right to go to war. In other words, can this war be justified? For example, a war to remove a genocidal dictator might be considered justified but a war to steal natural resources of a neighbouring country might not. Proportionality is also an important consideration here – the anticipated benefits must be proportional to the expected harms of the war.
    • Jus in bello: this considers how combatants should wage war – what acts are justified and what are not. For example, civilian targets would be prohibited, while prisoners of war should not be mistreated. Again, proportionality is important – the harm caused to civilians and civilian infrastructure must not be excessive relative to the expected military gain.

    International law, which contains some laws around war and military conduct inspired by Just War Theory, can be used to prosecute those accused of war crimes – for example, the Nuremberg Trials after World War II and the International Criminal Tribunal for the former Yugoslavia following the Balkan wars of the 1990s.

    Applying the past to the present
    Professor Lazar suggests asking similar questions of artificial intelligence:

    • Is it justified to use AI in the first place and in what circumstances? (jus ad bellum)
    • What is an acceptable way of using AI? (jus in bello)

    The second question is discussed a lot in machine learning and wider circles, where concepts of fairness, ethics and bias are regularly considered, but perhaps the first question is not addressed as often as it should be.

    Given the huge potential upsides and widespread use of machine learning and big data, a moratorium on their use is very unlikely to happen. Instead, as we grapple with the issues of being a society that is undeniably data driven, we should all ensure that any applications under our control are justified and fair.

    How can organisations – how can you – be more proactive about fairness?

    As business and government leaders look for ways to address issues of ethics and fairness with data and data technology, a good starting point when setting up (or reviewing) a system that uses machine learning is to consider if you should be using machine learning for this task in the first place. It’s a complex question, so answering the following may guide you to a decision:

    • What outcomes are you trying to achieve?
    • Are there unintended consequences (especially negative) of what you propose?
    • Are the potential benefits of the outcomes proportional to the potential harms of the process?

    The importance of diversity
    For many of these discussions, diversity is key – get a diverse set of stakeholders in the room, ideally including both those using and those who will be affected by the system, and have a robust conversation about all relevant issues.

    Part of this conversation should consider what you are currently doing. Ethics and fairness apply whether you use machine learning or not, so review the current state of play and whether your proposal improves this. Even if your proposal isn’t perfect (and it won’t be – we live in an imperfect world), it may be a sufficient improvement over the status quo to justify its use.

    The question of proportionality is critical here. Facial recognition software has many known problems, particularly for those with darker skin, but there may be some justification and acceptance for using it as part of a detective toolkit to find a mass murderer rather than a petty thief, so long as there is appropriate human oversight.

    Be vigilant at every stage of the process
    If you conclude there is a justification for using machine learning, then you can move onto the second question – how can you use it in an ethical way? It’s important to note that you need to think of the system as a whole – from data collection to modelling to application. You can’t just focus on the machine learning component – a fair model means little if the way in which it’s used is unfair. Problems with a system can occur at any point in the pipeline, and not only at the machine learning stage.

    Next week in part two of our ethical machine learning series, we delve into what it means to be fair, the surprising ways bias creeps in and how to ensure dystopian sci-fi scenarios remain in the realm of fantasy.

  • Big little highs

    As general insurance undergoes transformation, it’s the large and small things that count, and players will need agile thinking to flourish.
    Taylor Fry’s Scott Duncan explains.

    Paradigm shifts are well underway in the general insurance industry, signalling a change from old-world thinking to new. To thrive in the new world, insurers must think both big and small. Big-picture items include understanding social changes, refreshing product offerings and ensuring agile thinking to capitalise on opportunities. Thinking small involves placing yourself in the customer’s shoes to build a better understanding of their lifestyle and requirements.

    Here’s our take on three areas undergoing the change from old world to new, and recent initiatives showing the way forward.

    Partnerships

    Old world: Partnering with organisations that share the same DNA, and will therefore complement your existing offerings. The old-world terminology used to describe the benefit of these relationships is ‘synergies’.

    New world: Building partnerships on a micro and macro level with organisations that think differently from you and will therefore challenge the way you do business.

    Launched in 2016 and backed by a $75 million investment, IAG’s Firemark Ventures is described as the ‘strategic investment group within IAG for start-ups and emerging growth businesses’. Firemark is concerned with exploring:

    • New and innovative sources of data
    • Anything that has potential to impact the insurance value chain
    • New business models and technologies that expand or redefine insurance needs.

    In other words, IAG aims to think like a start-up and change the way it does business.

    As general insurance undergoes transformation, players will need agile thinking to flourish

    The customer

    Old world: Primary focus is on the customer’s short-term needs. Companies attempt to create additional value by offering discounts for multiple products and customer loyalty.

    New world: Developing a clear picture of the customer’s life stage, their lifestyle and needs. Reorienting the business to support a holistic view of the customer and aiming to interact with the customer on a regular basis.

    Suncorp allocated $142 million in FY18 to accelerate its marketplace strategy by creating a network of ‘brands, partners, solutions and channels’. The intention is to build a platform that encourages greater customer engagement and connection, and builds loyalty. To Suncorp, the value is clear: ‘Those [customers] who hold four products are nine times the value of those with just one product line.’

    New products and services

    Old world: ‘Squeezing’ customers into core products.

    New world: Flexibility, flexibility, flexibility – building on partnerships and developing greater understanding of the customer in order to offer appropriate products and respond to new ways of owning and using assets.

    Players are exploring opportunities to develop new products as risks to continued growth in traditional classes of business emerge. Examples include:

    • On-demand insurance. For example, Trov, in partnership with Suncorp, provides ‘on demand’ insurance for phones, laptops, tablets, wearables, headphones and photography gear.
    • Micro insurance in the sharing economy. Kevinsured.com offers cover for online interactions. Kevin uses blockchain, reviews the reputation of the buyer and seller in the transaction and, if the transaction is approved, currently provides $100 of free cover.
    • Sharing economy. ‘Mobilise’ is a platform that allows businesses to hire out equipment they are not using. Mobilise launched in association with Aon, which will source cover for the equipment while on hire.
  • When the algorithm fails to make the grade

    In our latest article, the story of the UK algorithm to assign high school grades following exam cancellations teaches an important lesson for everyone building models where questions of individual fairness arise.

    There have been many consequences of the pandemic. While health and employment concerns are rightly prominent, education is another domain that has seen significant disruption. One recent story intersecting with modelling and analytics is the case of school grade assignment in the UK. With final year exams cancelled due to the pandemic, the Office of Qualifications and Examinations Regulation (Ofqual) was presented with the challenge of assigning student grades, including the A-level grades that determine eligibility for university entrance.

    Part of the challenge is that centre-assessed grades (grades issued by schools based on internal assessment) are always optimistic overall compared to actual exam grades, so the process required choosing the best way to move grades closer to historical patterns. An algorithm was created to produce predicted grades across the whole student cohort.
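
    A simplified illustration of this kind of moderation (emphatically not Ofqual's actual algorithm – the grade shares and student names below are invented) is to take the teacher-assessed ranking of students and allocate grades so the cohort matches the school's historical distribution:

    ```python
    # Assumed historical grade shares for a school, plus a teacher-assessed
    # ranking of this year's cohort (best student first)
    historical_distribution = {"A": 0.2, "B": 0.3, "C": 0.4, "U": 0.1}
    students_ranked = ["Asha", "Ben", "Cara", "Dev", "Ela",
                       "Finn", "Gita", "Hugo", "Ines", "Jack"]

    # Hand out grades by rank so the cohort reproduces the historical shares
    grades = []
    for grade, share in historical_distribution.items():
        grades += [grade] * round(share * len(students_ranked))

    for student, grade in zip(students_ranked, grades):
        print(student, grade)      # top 20% get A ... bottom 10% get U
    ```

    Note how the last-ranked student receives a U purely because the school historically had a 10% fail rate – the individual-fairness problem discussed below.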

    However, when results were posted out, there was student outrage at the perceived unfairness, as many people received a lower grade than they expected. Pressure led to all governments across the UK backflipping and announcing that centre-assessed grades would be recognised instead of the algorithmic grades. While a win for many students who felt they deserved higher grades, it raises significant further questions and represents a poke in the eye for those who stood by the robustness of the algorithmic grades.

    In many ways the Ofqual algorithm for adjusting grades ticked all the right boxes:

    • The process was thorough and transparent, with a detailed report released explaining the methodology, alternatives considered and a range of fairness measures to ensure particular subgroups were not discriminated against.
    • The process used available data well, incorporating a combination of teacher-assessed rankings, historical school performance and cohort-specific GCSE (roughly equivalent to our school certificate) performance to produce grade distributions. Such approaches are also used elsewhere. For example, in the NSW HSC, school assessment grades are moderated down using school rankings so they reflect a cohort’s exam performance.
    • The process gave some benefit of the doubt to students, allowing for some degree of grade inflation. For courses and school cohorts where there were only a small number of students, more weight was given to centre-assessed grades.

    However, with the benefit of hindsight, it’s clear this effort was not enough. The main factors contributing to the government backdown were:

    • The stakes are very high. For many students, the difference between centre-assessed grades and modelled grades is the difference between their preferred university degree and an inferior option (or no university admission at all!). Students have a strong incentive to push back on the model.
    • Accuracy is good, but it was not great. While the report was careful to describe expected levels of accuracy (and choose methods that delivered relatively high accuracy), the reality is that a very large fraction of students got the ‘wrong’ grade, even if the overall distribution was fair. Variability across exams is substantial, and a very high level of accuracy would be required to neuter criticism and disappointment.
    • There were still some material fairness issues. Smaller courses are disproportionately taken by students at independent schools, and under the model these grades were less likely to be scaled back. Thus students attending independent schools were more likely to benefit from leniency provisions.
    • The model unilaterally assigned fail grades to students. The modelling included moving a substantial number of people from solid pass grades into the “U” grade (a strong fail grade, literally ‘ungraded’). There’s a natural ethical question as to whether it is fair to fail students who were not expected to fail according to their teachers, based on school rates of failure in prior years.
    • Perhaps most importantly, the approach failed to provide a sense of equality of opportunity. If you went to a school that rarely saw top grades historically, and your school cohort’s GCSE results were similarly unremarkable, there was virtually no way you could achieve a top grade in the model. This does not sit well with students; the aspiration is that any student should be able to work hard and blitz their exams. Instead, students felt they were effectively being locked into disadvantage if they had attended a school with historically lower performance.

    Unsurprisingly, the final solution (adopting the centre-assessed grades) will create its own problems. Teacher ‘optimism bias’ is unlikely to be uniform across schools, so students with more realistic teacher grading will be relatively disadvantaged. Teacher grades may be subject to higher levels of gender or ethnic bias. The supply of university places will not grow with the increased demand implied by higher grades; in some cases, this may be handled through deferrals, which may have knock-on effects for availability for 2021 school finishers. And overall confidence in Ofqual has taken a substantial hit.

    I think there are some important lessons here for data analytics more generally. First, models cannot achieve the impossible; in this case, it is impossible to know which students would have achieved a higher or lower mark. In a high-stakes situation, such limitations can break the implementation of a model. Second, it raises the point that something that appears ‘fair’ in aggregate can look very unfair at the individual level.
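
    A toy simulation makes the aggregate-versus-individual point vivid (all numbers here are assumptions): give every student a latent ability, add independent noise to both a hypothetical exam result and a model prediction, and convert each to grade bands by rank. The two grade distributions are identical by construction, yet a large share of individuals land in a different band:

    ```python
    import numpy as np

    rng = np.random.default_rng(42)
    n = 100_000
    ability = rng.normal(size=n)
    exam = ability + rng.normal(scale=0.7, size=n)    # noisy exam outcome
    model = ability + rng.normal(scale=0.7, size=n)   # equally noisy prediction

    def to_bands(scores, bands=5):
        """Convert scores to grade bands by rank, fixing the distribution."""
        ranks = scores.argsort().argsort()
        return np.floor(ranks / len(scores) * bands)

    mismatch = (to_bands(exam) != to_bands(model)).mean()
    print(f"{mismatch:.0%} of students get a different band")  # roughly half
    ```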

    In situations where individual-level predictions have a significant impact, we should spend time understanding how results will look at that granular level, and who the potential ‘losers’ of a model are. Finally, an algorithm will often become an easy target. As we’ve also seen in COMPAS and robodebt coverage, a faceless decision-making tool carries a high burden of proof to establish its credibility; this requirement applies from initial model design through to results and communication. Appropriate use of modelling is something we will need to continue to strive for in our work.

    #InTheNews – “England exams row timeline: was Ofqual warned of algorithm bias?” from @guardian https://t.co/MjCKgyYc9V #NAPCE #pastoralcare #schools #education #teachers #exams #childwelfare #studentwelfare #covid19 #gcses #alevels

    — NAPCE (@NAPCE1) August 21, 2020

    UK ditches exam results generated by biased algorithm after student protests https://t.co/ZQtWT1iqJe

    — The Verge (@verge) August 17, 2020

    As first published by Actuaries Digital, 24 September 2020

  • New Zealand general insurance 2020 update: news, views and trends

    How is the New Zealand general insurance market faring in 2020, and what are its biggest challenges going forward? Using the latest data and our market insight, we shed light on the New Zealand general insurance landscape, covering profitability, COVID-19, regulatory issues and more.

    Overall profitability strong, driven by few weather-related events

    Despite the uncertainty and global challenges of the ‘new normal’ brought on by the pandemic, general insurers in New Zealand have enjoyed a period of profitability over the past calendar year. This profitability – shown in the figure below – has been heavily influenced by relatively few large weather-related events between July 2018 and October 2019.

    Source: RBNZ Quarterly Insurance Financial Performance

    The run of benign weather was broken by the Timaru hailstorm in November 2019, which resulted in total losses of more than $130m [1]. In contrast to most New Zealand storm events, where typically property classes are most impacted, more than half of the costs of the Timaru hailstorm have been for motor claims.

    Source: ICNZ cost of natural disasters

    In 2019, property risks for personal lines and commercial lines classes of business in the New Zealand general insurance market had their lowest loss ratios in the past five years [2], predominantly due to the lack of large weather events over that period.

    Source: ICNZ Market Data

    Reinsurance woes mean bigger costs for customers

    The favourable claims experience in 2019 resulted in a flattening of premium rates for the personal lines property classes. Annual inflation for home insurance for the year to 30 June 2020 was 0.4%, following three years of increases greater than 5%. Premiums for contents insurance have, however, continued to increase, although average premiums are generally lower for contents than for home, due to lower sums insured.

    The increase in premiums for home and contents over the past five years has mainly been due to increases in reinsurance costs. The figure below shows gross premiums, which are the costs that customers pay, have increased by an average of 4.8% per annum over the past five years. Net premiums, which are the costs excluding reinsurance, have increased by only 0.9% over the same period.
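
    Some back-of-envelope arithmetic shows how quickly the implied reinsurance cost must be growing. The 70/30 split between net premium and reinsurance in the starting year is purely an assumption for illustration:

    ```python
    # Index gross premiums to 100, with an assumed 70 net / 30 reinsurance split
    gross, net = 100.0, 70.0
    for _ in range(5):
        gross *= 1.048                 # gross grows 4.8% p.a.
        net *= 1.009                   # net grows 0.9% p.a.

    ri_start, ri_end = 100.0 - 70.0, gross - net
    growth = (ri_end / ri_start) ** (1 / 5) - 1
    print(f"implied reinsurance cost growth: {growth:.1%} p.a.")   # ~12% p.a.
    ```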

    Source: ICNZ Market Data

    The premiums New Zealand homeowners pay are strongly influenced by the global reinsurance market, and globally, reinsurance losses have been increasing as shown in the figure below. This is due to a combination of increases in the number and severity of natural disasters, and growth in the size of the market, as emerging markets seek reinsurance cover. In Australia, the bushfires of early 2020 are expected to result in the Australian market having its first underwriting loss since 2011 [3].

    Source: Swiss Re – Sigma explorer

    Global reinsurers hit hard by the pandemic

    Reinsurance rates are also impacted by the capacity of the global reinsurance market, that is, the amount of capital held by reinsurers that enables them to write reinsurance business. Between 2008 and 2017 reinsurance capacity was increasing. This increase in capacity helped limit the size of reinsurance premium increases.

    Reinsurance capacity has been stable since 2017, but early indications are that capacity has reduced in 2020 as the impact of COVID-19 hits global reinsurers in the form of increased losses, weakening balance sheets from reductions in asset values and reduced investment returns from lower interest rates. Swiss Re recently announced it expects reinsurance premiums to increase for the 2021 year [4]. We expect premiums for New Zealand property risks to continue to increase as reinsurance costs rise in the New Zealand general insurance market.

    Source: Aon reinsurance market outlook June/July 2020

    COVID-19 creates uncertainty for insurers

    The COVID-19 pandemic has already had a large impact on the entire New Zealand economy, and the way companies operate. The government-imposed lockdown during April and May meant insurers had to move quickly to remote working. The lockdown also resulted in a reduction in claims for motor insurance. In response, three insurers announced they would be partially refunding premiums to motor customers due to the reduced level of claims during the lockdown [5].

    In the longer term, the economic impacts of COVID-19 are uncertain. The recent outbreak in Auckland has shown strong border controls will need to remain in place for the foreseeable future. With tourism playing a large part in the New Zealand economy, the longer these restrictions remain in place, the larger the financial impact on the country. The government has so far been proactive in providing support to New Zealand businesses, which has limited the impact on key economic trends, such as unemployment and GDP, but it’s likely more financial pain lies ahead.

    Unsurprisingly, COVID-19 has also created uncertainty for New Zealand general insurers. From a claims perspective, the impact will vary between classes of business. For example:

    • Traffic volumes tend to reduce during economic downturns
    • An increase in the number of people working from home may result in lower numbers of contents claims due to theft
    • Economic downturns can often result in an increase in fraudulent claims, which may result in an increase in claim frequency for personal contents and commercial property claims
    • Claims under credit insurance policies will also increase as unemployment starts to increase, although this is a relatively small class of business in the New Zealand market.

    Low premium volume: a recession reality

    The greater impact for New Zealand general insurers will be on premium volumes. Already, all major insurers have introduced hardship measures to support customers. Suncorp announced $30m worth of customer support in its year-end results, although more than half of this relates to AA Insurance’s motor premium refunds. A recession will see a reduction in premiums as customers in financial hardship look to reduce their costs. This is likely to hit the personal lines and commercial SME classes the hardest.

    The other component insurers will need to monitor is how the economic crisis affects asset values and investment returns. Falling interest rates will reduce investment returns, although this will be offset by increasing values for fixed-interest assets. Most New Zealand insurers have quite short-tailed liabilities and so are less affected by investment returns. Even so, it’s still important for insurers to monitor their assets as well as their liabilities and consider how their values will respond in various economic scenarios.

    Climate change impacts should still be front of mind

    While COVID-19 has been the focus for most insurers over the past six months, it hasn’t reduced the impact of longer-term risks. In late 2019, the annual General Insurance Barometer, Taylor Fry’s joint publication with JP Morgan that surveys Australian general insurers [6], identified climate change, with its associated natural perils, and regulation as the two main issues for insurers. The ongoing and far-reaching implications of climate change for insurers remain, despite being overshadowed by COVID-19 over the last six months.

    These issues will be no different for New Zealand insurers, as climate change effects will have an unfortunate synergy [7]. Climate change is expected to increase the frequency of severe meteorological events, such as floods, landslides, strong winds and hail. The higher temperatures and rainfall arising from climate change will also shorten the lifespan of many buildings. The combination of these effects will result in higher insurance losses in the future. Independent analysis of the effect of climate change on residential property damage estimates an increase in insurance losses of between 7% and 8% between 2020 and 2040, and of between 9% and 25% from 2080 to 2100 [8].

    Insurers in the New Zealand general insurance market will need to watch out for the emergence of weather-related risks that have not historically been a major concern. The summer drought in Auckland and Northland is potentially a warning sign that risks such as bushfires may increase in New Zealand if we start to see significant changes in regional climates. Over the past four years, we have seen property losses due to summer bushfires in the Port Hills (2017) and Tasman district (2019). While insurance losses for these events have been smaller than other weather-related events, a bushfire in a more densely populated region (such as the Waitakere Ranges) could result in higher losses.

    Regulatory developments

    On the regulatory front, insurers have plenty of activity to keep a close eye on:

    • In July, the RBNZ wrote to insurers outlining its expectations regarding insurers’ capital positions [9]: given the severe economic uncertainty facing the country, it expects insurers to protect or build their capital positions. In particular, insurers should not be taking any actions that reduce capital, including any reductions due to payments of dividends. This advice makes clear that the RBNZ views the current solvency levels of insurers as being too low, despite an increase in solvency capital over the past two years for all insurers that comply with New Zealand solvency requirements (as opposed to meeting solvency through an overseas parent company). Given the likely reduction in premium volumes resulting from the economic downturn, it’s vital insurers continue to manage their claims and expense costs to maintain capital levels. It also makes sense for insurers to continually monitor their projected capital positions, which will be more uncertain over the next few years due to the unknown future effect of COVID-19.

    Source: Insurers websites/accounts from New Zealand Companies Office

    • In March, the RBNZ released the findings of its thematic review of the Appointed Actuary regime [10]. The main theme from the review was a need for the RBNZ to provide clarity and guidance on its expectations of the Appointed Actuary role. The review found that most insurers saw the purpose of the Appointed Actuary as being to fulfil statutory requirements. The review makes it clear, however, that the RBNZ’s expectations are much wider than this. It expects the Appointed Actuary to be involved in the insurer’s strategy and be strongly engaged with the insurer’s Board and the RBNZ. Insurers will be watching out for RBNZ consultation on a purpose statement on its expectations and objectives of the Appointed Actuary role.
    • The RBNZ has also just announced the continuation of its review of the Insurance (Prudential Supervision) Act 2010, which was suspended in 2018 due to other priorities [11]. This review will also include a review of the associated Solvency Standards, with a likely introduction of trigger points for regulatory action based on the level of solvency ratios, compared to the current ‘pass or fail’ approach. The announcement was made by RBNZ Deputy Governor Geoff Bascand in a speech to ICNZ [12], where Bascand also outlined the RBNZ’s planned response to a world of heightened risk. In particular, the key developments that insurers can expect to see “… will be more intense supervision, particularly in relation to verifying information received from insurers, and concluding enquiries more efficiently”.
    • The other area insurers will be keeping a close eye on is the EQC review. A public inquiry into the EQC was completed by Dame Silvia Cartwright in March 2020. One of the recommendations from this review was that the current EQC cap on cover be lifted from $150k to $400k. The Government has accepted the intent of this recommendation and asked Treasury to consider it as part of a piece of work it’s undertaking to modernise the Earthquake Commission Act 1993 following the inquiry [13]. There’s a long way to go on this recommendation, but if it were undertaken by a future Government it would have a substantial impact on home insurance.

    Signs of possibility amid the challenge

    It’s fair to say 2020 has been a very unusual year, adding uncertainty to an industry already dealing with more than its fair share of it. The short-term uncertainty surrounding COVID-19 will be taking insurers’ immediate focus, yet longer-term risks such as climate change and regulatory developments have not diminished. New Zealand insurers have shown in their initial responses to COVID-19 a business agility and willingness to embrace change – promising signs the industry is well placed to continue adapting to an ever-changing future.

  • Modelling Victoria’s second wave of COVID-19 cases

    Last month, we looked back at the trajectory of COVID-19 in Victoria, modelling the second wave of COVID-19 cases to see how effective various restrictions were at slowing the spread of the disease. Since then, the Victorian Government has introduced several additional restrictions. In this article, we use the same modelling techniques to take a retrospective look at how effective these restrictions have been.

    In our initial analysis, we looked at COVID-19 infection data in Victoria up to 3 August. This meant we were able to estimate the impact of the various restrictions put in place in June and July, including:

    • Restrictions on family gatherings towards the end of June
    • Progressive lockdowns on 2 and 9 July
    • Introduction of mandatory facemasks on 23 July

    We found each introduced measure did lead to a reduction in the transmission rate of COVID-19. However, after the introduction of mandatory facemasks the transmission rate was balanced on a knife’s edge – while the epidemic was under control, without further measures the number of new cases would not have reduced.

    Of course, there were further measures – the full metropolitan lockdown on 2 August and the regional and business measures on 6 August – and there was some evidence at the time that a turning point might have been reached.

    Now, over a month later, no modelling is necessary to see that the peak in cases did indeed occur in late July/early August and that numbers have been declining since, with levels over the last few days similar to those seen back in June. The strict conditions endured by Victorians, those in Melbourne in particular, have paid off in a significantly decreased disease incidence.

    Daily reported cases of COVID-19 in Victoria

    We’ve rerun our analysis (based on that used by German researchers – Dehning et al – to measure the impact of measures at different points in time), focusing on answering the following questions:

    • With updated data (the Department of Health and Human Services in Victoria makes retrospective changes to data as more information comes to light about each case), do we still estimate an Reff of 1 after the introduction of mandatory facemasks?
    • What was the impact of the lockdown measures on 2 and 6 August?
    • What have the transmission rates been like over the past few weeks?

    What did we find?

    In the figure below, we plot the effective reproduction number as estimated by the model, and measure the change in the reproduction number between key dates, such as easing or re-introduction of restrictions. The effective reproduction number is a measure of how many people an infected individual passes the disease on to, and so captures the trajectory of the disease. Reproduction numbers:

    • Above 1 are indicative of a growing epidemic
    • Below 1 correspond to decreasing numbers of new infections.

    Effective COVID-19 control strategies reduce the reproduction number so it is below 1. When the reproduction number is 1, new cases remain at a constant level.
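
    A few lines of arithmetic show why the threshold at 1 matters so much – the same starting point leads to very different places depending on the reproduction number (illustrative only, with values chosen to echo the estimates below):

    ```python
    # Generation-by-generation case numbers under different reproduction
    # numbers, starting from 100 cases
    for R in (1.4, 0.99, 0.55):
        cases, trajectory = 100.0, [100]
        for _ in range(6):             # six generations of infection
            cases *= R
            trajectory.append(round(cases))
        print(f"R = {R}: {trajectory}")

    # R = 1.4:  [100, 140, 196, 274, 384, 538, 753]  -> growing epidemic
    # R = 0.99: [100, 99, 98, 97, 96, 95, 94]        -> roughly constant
    # R = 0.55: [100, 55, 30, 17, 9, 5, 3]           -> rapid decline
    ```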

    The figure below shows the estimated reproduction number from July onwards, with the corresponding numerical estimates shown in the table below. The model assumes that changes occur over a number of days, usually about 3 days, so the reproduction number estimated at the 23 July change point (0.99) is reached about 3 days afterwards, on 26 July. A similar delay affects the other points.

    Estimated reproduction rates at key dates over June, July and August

    | Intervention date | Significance | Estimated reproduction number following the intervention |
    | --- | --- | --- |
    | 9 Jul | Lockdown of Melbourne and Mitchell Shire | 1.4 |
    | 23 Jul | Introduction of mandatory masks | 0.99 |
    | 2 Aug | Level 4 restrictions in Melbourne | 0.81 |
    | 6 Aug | Level 3 restrictions in regional Victoria and business restrictions | 0.67 |
    | 13 Aug | 1 week after last intervention | 0.55 |
    | 20 Aug | 2 weeks after last intervention | 0.53 |
    | 27 Aug | 3 weeks after last intervention | 0.51 |

    We see the answer to our questions in the figure and the table, namely there was:

    • A reproduction number of essentially 1 after the introduction of mandatory facemasks on 23 July, indicating case numbers would not have reduced without further interventions
    • A significant reduction in spreading rates after the introduction of Level 4 restrictions in Melbourne on 2 August
    • A further significant reduction in spreading rates after the introduction of regional and business restrictions on 6 August – prior to that a significant number of infections had been acquired at work
    • Some further reductions over the following 2 weeks, with the reproduction number estimated to be around 0.5 in late August.

    What does this mean for future case numbers?

    We’ve shown the actual and modelled case numbers below – our model used data up to 16 September so we’ve highlighted the data since then in orange. Numbers continue to trend downwards. If things stay on track, then signs are positive for Victorians, and consequently all Australians, as we head into the summer months.

    Actual and modelled daily reported cases of COVID-19 in Victoria

    A bit more on how we did it

    The methodology outlined in the paper Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions combines a standard epidemiological model of disease (the SIR model) with a list of intervention dates that result in changes in the underlying epidemiological parameters. The researchers used a Markov Chain Monte Carlo method to estimate the various model parameters at different points in time, and the uncertainty in these estimates. We have replicated this work for Victorian data.

    We have assumed an approximately eight-day delay between infection and identification. This consists of the incubation period, the onset of COVID-19 symptoms sufficient to warrant a test, and the time taken to take a test and receive the result. Roughly speaking, this means we assume cases notified today relate to infections that took place, on average, eight days ago. Since the modelling approach uses simulations, a range of delays, centred on eight days, are used in practice.
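
    For readers who want the mechanics, here is a stripped-down forward simulation in the same spirit – a discrete SIR model with piecewise spreading rates and an eight-day reporting delay. It is a sketch only: the parameter values below are assumptions chosen to echo the table above, whereas the real analysis estimates them (with uncertainty) via MCMC:

    ```python
    import numpy as np

    N = 6_500_000         # approximate Victorian population (assumption)
    mu = 1 / 8            # recovery rate, ~8-day infectious period (assumption)

    # Piecewise spreading rate lambda(t); R_eff ~ lambda / mu while S ~ N.
    # Days counted from 1 July; levels echo the estimates in the table above.
    change_points = [(0, 1.4 * mu), (22, 0.99 * mu), (32, 0.81 * mu), (36, 0.67 * mu)]

    def spreading_rate(t, ramp=3):
        """lambda(t), interpolated linearly over ~3 days at each change point."""
        lam = change_points[0][1]
        for day, new_lam in change_points[1:]:
            if t >= day + ramp:
                lam = new_lam
            elif t > day:
                lam += (t - day) / ramp * (new_lam - lam)
        return lam

    S, I = N - 2000.0, 2000.0          # initial infections (assumption)
    new_infections = []
    for t in range(90):
        today = spreading_rate(t) * S * I / N    # new infections on day t
        S, I = S - today, I + today - mu * I
        new_infections.append(today)

    # Reported cases lag infections by ~8 days (incubation plus testing delay)
    reported = [0.0] * 8 + new_infections[:-8]
    print(round(max(reported)))        # peak of the modelled reported curve
    ```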

  • RADAR 2020

    Welcome to RADAR 2020, Taylor Fry’s inside look at the general insurance industry, the state of the market and what it means for insurers.

    Here’s an overview of the highlights from RADAR 2020, our class-by-class analysis to the end of an eventful FY2020 …
    • While pandemic concerns have dominated headlines, the longer-term impacts and risks of a warming climate remain as urgent as ever. This shows in the overall underwriting results for general insurers, which worsened slightly over the year to FY2020, affected by catastrophic weather-related events costing in excess of $5.2B. The extreme bushfire season led to a NSW inquiry and a royal commission aiming to improve how we mitigate and respond to natural disasters.
    • Insurers have been significantly impacted by COVID-19, though experience varies by class of business. Travel insurance has been hit particularly hard, experiencing falls in premium and surges in claims. Ongoing uncertainty in the travel industry is also affecting staff engagement and morale. Other classes such as motor have had the lowest loss ratios on record. The ability of insurers to rely on certain forms of pandemic exclusions to deny business interruption claims is being tested in the courts. There are likely to be adverse impacts ahead for most lines of business, as poor economic conditions and increasing community hardship constrain premium increases and pressures on claims continue.
    • Regulation of insurer conduct and disclosure obligations continued to strengthen in the aftermath of the Hayne Royal Commission, heralding a cultural shift in the way insurers deal with their customers. Implementation timelines for several initiatives have been extended by six months in recognition of COVID-19, though insurers have fast-tracked their support for customers who are experiencing vulnerability and financial hardship under the new 2020 General Insurance Code of Practice.
    • Profitability for many commercial classes continues to be under pressure. Profitability for commercial property was affected by catastrophic weather events, while the directors and officers class has been adversely impacted by class actions over several years, with potential for the recent economic downturn to instigate further claims activity. Elevated cyber risk has also raised the potential for ‘silent cyber’ claims to impact several classes of commercial insurance, where an insurer may have to pay claims for cyber-related losses under a traditional insurance policy not designed for the purpose.
    • Insurers are anticipating an increase in primary and secondary psychological claims in workers compensation. These are expected to arise from changes to work demands, shifts in working arrangements and COVID-19-related restrictions.
    • Overall reserve releases on long-tailed classes were subdued during FY2020, which put further upward pressure on incurred claims and loss ratios. Public and products liability as well as professional indemnity experienced reserve strengthening during FY2019 and again in FY2020, which contrasted with several years of reserve releases in the preceding years.

    Download RADAR 2020 for more expert insights on the shifts and trends in the industry to help you navigate the uncertainty and discover opportunity in our evolving insurance landscape.

    Find out more about our Appointed Actuary and General Insurance services.

  • Algorithm Charter for Aotearoa: Six things to be doing now

    The New Zealand Government has recently released its Algorithm Charter for Aotearoa New Zealand. Effectively, the Algorithm Charter is a call to action for government agencies to provide New Zealanders with the confidence that algorithms are being used sensibly and ethically. We look at how government agencies can use the Charter to drive the effective and efficient delivery of government services in New Zealand.

    Developments in computing have enabled increasingly sophisticated algorithms to support decision-making in all facets of our lives. Algorithms are sequences of steps used to solve problems; when paired with large underlying databases for training, they can help us plan and guide our lives. Alongside new playlist recommendations and advertisements for things we never knew we needed, algorithms are increasingly responsible for improvements in the quality of services offered by governments to their citizens. There are a wide range of operational functions already being guided by algorithms in New Zealand, including elective surgery prioritisation, allocation of Work and Income clients to services, and youth offending risk screening. Stats NZ’s 2018 Algorithm Assessment Report provides a useful stocktake.

    As New Zealand government organisations scale up their use of algorithms, public servants are rightly nervous about getting it right. With great power (and highly sensitive administrative data) comes great responsibility, and highly publicised overseas government algorithm failures (for example, policing algorithms judged to be racist) are front of mind for many.

    What are the major risks when using algorithms for government decision making?

    Algorithms encompass simple techniques, such as regression models and decision trees (which can be used to make predictions and streamline business processes), through to more complex approaches like neural networks and ensemble models (closer to machine learning). Ideally, they streamline processes, make predictions, and reveal insights about problems in a way not possible for the human mind alone. However, if algorithms are not built with care they can be ineffective, or at worst dangerous.

    Agencies need to be clear about the major risks of using algorithms for government decision making. These include:

    Unfairness

    The data used, or the way the algorithm is built can result in some sectors of society being unfairly targeted (or not targeted) by the services informed by the algorithm. For example, earlier this year, an African American man in Michigan was wrongfully arrested and held in a detention centre for nearly 30 hours after facial recognition technology incorrectly identified him as a suspect in a shoplifting case. Relatedly, the U.S. Department of Commerce’s National Institute of Standards and Technology’s Face Recognition Vendor Test looked at 189 software algorithms from 99 developers and found higher rates of false positives for Asian and African American faces relative to images of Caucasians.

    Government agencies need to be particularly sensitive to ensuring the fairness of their algorithms, as they must meet human rights legislation which prohibits discrimination on the grounds of gender, sexual orientation, religious and ethical beliefs, colour, race, ethnicity, disability, age, marital status, political opinion, employment status and family status. However, satisfying ‘fairness’ criteria across all of these measures at the same time may prove challenging, and organisations will need to understand the trade-offs and make clear determinations on which they are adhering to, and why.

    Ensuring fairness may mean some variables need to be excluded as inputs to an algorithm, but grey areas remain. For example, if ‘gender’ is excluded, a secondary input such as ‘hobby’ might be strongly correlated with it. As a result, the algorithm could inadvertently discriminate based on gender anyway.
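
    A small synthetic example shows how the leak happens (all variables and numbers below are invented, and scikit-learn is assumed). Gender is withheld from the model, yet because ‘hobby’ tracks gender 80% of the time, predictions still split along gender lines:

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 10_000
    gender = rng.integers(0, 2, n)                 # the restricted variable
    hobby = (gender + (rng.random(n) < 0.2)) % 2   # aligns with gender 80% of the time
    outcome = (rng.random(n) < np.where(gender == 1, 0.7, 0.3)).astype(int)

    # Train WITHOUT gender -- hobby is the only input
    model = LogisticRegression().fit(hobby.reshape(-1, 1), outcome)
    p = model.predict_proba(hobby.reshape(-1, 1))[:, 1]

    # Predictions still differ markedly by the excluded variable
    print(p[gender == 1].mean(), p[gender == 0].mean())   # ~0.57 vs ~0.43
    ```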

    Accountability

    How algorithms inform decisions may not be clear, meaning that biases may go undetected. This makes clear lines of accountability critical. Without clear accountability, the safe implementation of an algorithm will be jeopardised. Machine learning and AI algorithmic techniques are particularly prone to ‘black box’ executions, where deciphering how the algorithm works is difficult. But it doesn’t have to be this way. Interpretable ‘glass box’ algorithms that facilitate strong lines of accountability are entirely feasible when the right techniques are employed. Our recent article, Interpretable machine learning: what to consider, discusses these issues in more detail.
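
    To give a flavour of what ‘glass box’ means in practice, a shallow decision tree’s learned rules can be printed and audited line by line (toy data; scikit-learn assumed):

    ```python
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Toy inputs, e.g. [age, flag], with labels that depend only on age
    X = [[20, 1], [25, 0], [40, 1], [45, 0], [60, 1], [65, 0]]
    y = [0, 0, 0, 1, 1, 1]

    tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
    print(export_text(tree, feature_names=["age", "flag"]))
    # |--- age <= 42.50
    # |   |--- class: 0
    # |--- age >  42.50
    # |   |--- class: 1
    ```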

    Lack of transparency, and consequently accountability, was an issue for Microsoft when, in 2016, it launched a chatbot on Twitter called Tay. It was programmed to learn how to interact like a real human, through reading and processing real tweets. However, human oversight was lacking as the black box model evolved, and within hours it transformed from an innocent bot into an offensive, loud-mouthed racist.

    Ongoing alignment with rules and policy

    Periodic review of algorithms to ensure they remain fit for purpose may not occur, and human oversight, an essential component in protecting against sub-optimal outcomes and perverse incentives, may not be retained beyond an initial implementation period. While algorithms can inform targeting of services and save time, they need to align with the problem to be solved and should be updated as the broader environment evolves. The Australian Government’s ‘Robodebt’ saga is a cautionary tale. The Online Compliance Intervention was an automated debt recovery program introduced in mid-2016 in an attempt to ensure recipients of welfare benefits were not under-reporting their income and, as a result, over-receiving welfare payments. Almost half a million Australians received correspondence regarding overpayments, but hundreds of thousands of the assessments appeared wrong, because the algorithms had included income from other periods, when people were actually in paid employment and not claiming benefits. The Australian Government is now paying refunds to all 470,000 Australians who had their debt calculated using this income averaging methodology. Human oversight is critical to the safe operation of algorithms and to maintaining the social licence to use them.

    The issues we’ve briefly highlighted are all inter-related. Algorithms need to get it right, every time, because any single issue can erode public confidence irrevocably, no matter how effective the algorithm is.

    Where does the Algorithm Charter for Aotearoa fit in?

    Recognising these risks of mismanagement, the New Zealand government recently released the Algorithm Charter for Aotearoa New Zealand. It positions New Zealand as a world leader in setting standards to guide the use of algorithms by public agencies. Many government agencies have already signed up to the charter. The charter includes several commitments designed to ensure the appropriate management of the risks outlined above.

    What should agencies do to ensure they are implementing the Charter well?

    The Algorithm Charter for Aotearoa New Zealand is an enabler. By abiding by standards of fairness and transparency set out in the charter, government agencies can improve public confidence in the use of algorithms in government decision making. Without it, the use of algorithms may continue to be met with community resistance. Ideally, greater public trust, built on the foundations of this charter, will lead to greater use of algorithms (and leveraging of data more broadly), and ultimately more effective and efficient government services. A win for service users, communities and government agencies alike.

    We outline six things government agencies can do to position themselves well for implementing the charter:

    1. Review your services and identify which are informed by algorithms.

    Government agencies should perform a stocktake of all algorithms that are informing operational decisions. In most cases the relationship between a service and an algorithm is direct, for example, an algorithmic tool that informs the prioritisation of patients for surgery. However, sometimes the relationship may not be direct, such as when output from one algorithm is hard coded into another. Expert advice may be beneficial to support a stocktake.

    2. Prioritise algorithms for review.

    The Algorithm Charter offers an easy rule-of-thumb approach, suggesting government agencies rate each algorithm based on:

    • The likelihood of unintended adverse outcomes arising from use of the algorithm
    • The impact that unintended adverse outcomes would have on New Zealanders

    This rule of thumb is useful for an initial high-level prioritisation; however, judging the likelihood of unintended adverse outcomes may be difficult, particularly where the algorithm lacks transparency. In such cases, it will be beneficial to gather subject matter and technical experts to work through the implications.
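
    A minimal sketch of the rule of thumb in code (the algorithm names and 1–5 scores below are hypothetical placeholders, not a Charter-prescribed scale):

    ```python
    # Rate each algorithm on likelihood and impact; review highest scores first
    algorithms = {
        "surgery_prioritisation":  {"likelihood": 2, "impact": 5},
        "service_allocation":      {"likelihood": 3, "impact": 3},
        "playlist_recommendation": {"likelihood": 4, "impact": 1},
    }

    by_priority = sorted(algorithms.items(),
                         key=lambda kv: kv[1]["likelihood"] * kv[1]["impact"],
                         reverse=True)
    for name, scores in by_priority:
        print(name, scores["likelihood"] * scores["impact"])
    # surgery_prioritisation 10, service_allocation 9, playlist_recommendation 4
    ```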

    3. Peer review (in priority order) algorithms against the Algorithm Charter commitments.

    Peer reviews should be thorough. Identifying bias in algorithms can be the most difficult, but arguably the most important, aspect of a review.

    Public confidence in any review is likely to be enhanced if it is independent from the creators and end-users of the algorithm.

    4. Adjust algorithms where they do not meet Algorithm Charter commitments fully.

    Modern algorithmic techniques can be employed to address most issues. Transparent, interpretable algorithms are entirely possible. And while building public confidence in the use of algorithms may take time, it is an attainable outcome if charter commitments are adhered to.

    It will be beneficial at this stage to think about what might be considered fair or unfair by society in terms of the way the algorithm affects decisions. Any adjustments to algorithms will need to be designed with fairness in mind.

    Releasing public advice that a review has been undertaken and that some algorithms have been adjusted as a result will help make it clear that algorithms are being effectively monitored under the charter.

    5. Put in place a periodic review process for each algorithm.

    The regularity and depth of periodic reviews for each algorithm should be commensurate with the assessed likelihood and impact of unintended adverse outcomes. Your organisation should also have a management plan in place for its algorithms in the event they result in unintended consequences for New Zealanders. The plan should become operational as soon as issues are identified.

    6. Set up guidelines for the development and management of new algorithms.

    The Algorithm Charter puts in place generic commitments. Each government agency will need its own set of more detailed guidelines for new algorithms. These will need to reflect the nature of the services a particular government agency provides, the sensitivity of service users to the use of algorithms, and the use of their data to inform them. For example, high sensitivity over the use of personal health data may dictate specific guidelines relating to the use of algorithms in the health sector.

    Embrace the opportunities presented by the Algorithm Charter

    While the scale of the task of implementing the Algorithm Charter for Aotearoa shouldn’t be underestimated, there is a substantial upside to getting it right. Increased public trust in governments using algorithms for good should give decision makers more confidence to identify new opportunities to improve services using algorithms. This would be of particular benefit in fields where a lack of confidence has previously prevented their use.

  • Interpretable machine learning – what to consider

    Well-designed machine learning models can be powerful tools for driving better decision making and growing your business. But what if your models are inadvertently creating problems for your business or clients? For ‘black box’ models, it can be difficult to ‘look under the hood’ and find out what’s driving model performance. In this article, we look at what you should consider to create explainable and interpretable machine learning models that perform ethically.

    What does my model really do?

    Algorithms and machine learning are affecting us as individuals more and more. On a day-to-day basis, it’s seemingly routine applications such as the books or movies we’re recommended, or the search results returned by Google. But they also feature in more significant decisions: whether our loans are approved and how much our insurance costs, through to life-changing decisions made in healthcare and criminal justice settings. And even routine applications such as YouTube or Google search results sometimes have far-reaching consequences – including creating echo chambers that only show us what the algorithm thinks we want to see. As a harsh light is shone on discrimination and biases within our society, the ethics around some of these uses of machine learning (ML) are becoming increasingly topical.

    In particular, the machine learning community is paying close attention to topics of fairness, interpretability, transparency and accountability, and new data protection and privacy laws attempt to address some of these concerns. As we’ve discussed previously, much current ML research centres on developing tools for explainable ML. Many ML models are considered ‘black box’ models – models that produce decisions based on various inputs, but where the process by which the decisions are made is opaque, usually due to model complexity. Explainable ML essentially involves a post-processing stage, which takes the output of the black box model and overlays a mechanism to explain its results. This focus on ‘explainability’, however, can be problematic. Many such explanations are poor, incomplete and not particularly meaningful, merely justifying what the model does. Cynthia Rudin, Professor of Computer Science, Electrical and Computer Engineering, and Statistical Science at Duke University, has advocated for a push towards ‘interpretability’ over ‘explainability’.
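
    To make the overlay idea concrete, here is a minimal sketch (our own illustration, not a method from Rudin’s work) of one common explainability technique, a global surrogate model. A shallow tree is trained to mimic the black box, and its ‘fidelity’ measures how often the explanation agrees with the model it is meant to explain; wherever they disagree, the explanation is simply wrong about the model:

    ```python
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # the 'black box' we want to explain
    black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

    # the explainability overlay: a shallow tree trained to mimic the black box
    surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
    surrogate.fit(X_train, black_box.predict(X_train))

    # fidelity: how often the explanation agrees with the model it explains
    fidelity = accuracy_score(black_box.predict(X_test), surrogate.predict(X_test))
    print(f"surrogate fidelity: {fidelity:.2f}")
    ```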

    In contrast to explainability, interpretability actually requires a full understanding of the path of computations leading to a prediction. In an article in Nature Machine Intelligence, appropriately titled Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead, Rudin and her collaborators argue:

    • There’s often a false dichotomy between interpretability and accuracy, and interpretable machine learning models can often perform just as well as black box models. Rudin and her team entered a competition to build an explainable black box model, and found the interpretable model they constructed instead actually had similar predictive accuracy to the best black box model.
    • It’s highly likely an interpretable model of comparable performance to the best black box models exists for many data sets. Rudin and her team use a theoretical argument and anecdotal experience to argue for this. Significantly, they call into question the widely-held belief there is always a trade-off between accuracy and interpretability. So, although it may be easier and less time-consuming to create a black-box model, chances are we could find an interpretable model if we tried.
    • Interpretable models should be the default for any high-stakes decision. This is particularly relevant for decisions that deeply affect people’s lives: not only do interpretable models often match the performance of black box models, but the basis of each decision can be fully understood. Interpretable models are also less prone to error, and any errors that do occur can be detected more easily and sooner.
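
    The first of these claims is easy to test on any given data set. Below is a minimal sketch comparing a gradient boosting machine with a penalised logistic regression; the models and data set are our own choices for illustration, and results will naturally vary by problem:

    ```python
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import HistGradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)

    models = {
        "black box (gradient boosting)": HistGradientBoostingClassifier(),
        "interpretable (logistic regression)": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
    }
    # on this data set the two cross-validated accuracies are typically very close
    for name, model in models.items():
        print(f"{name}: {cross_val_score(model, X, y, cv=5).mean():.3f}")
    ```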

    Why do we use black box models?

    Before we drill down further into interpretable ML, it’s helpful to remember there are several reasons why we might favour black box models. Essentially, they are efficient to fit and tend to perform well. Unlike an interpretable model, which often requires considerable analyst skill, domain knowledge and feature engineering to fit, a black box model can process a lot of data, discover the key patterns and generate accurate predictions in a fraction of the time.

    In many settings, fitting a good model quickly is important:

    • Where customer behaviour changes rapidly and frequent model updates are required.
    • In automating the labour-intensive parts of a process, which frees up skilled analysts to have more time for understanding and taking a deep dive into the results.
    • To enable more regular updating of key financial information.
    • To increase a model’s speed to market, which can lead to better results for a company.

    In the insurance world, for example, black box models can be important in maintaining market position in personal lines pricing, and also provide a mechanism for monitoring outstanding claims more closely. They’re useful in marketing models for detecting patterns in customer behaviour or determining products of interest.

    In short, black box type algorithms such as gradient boosting machines or neural networks are extremely flexible and high-performing tools for fitting models to large data sets. It’s no accident that algorithms such as XGBoost are top performers on competition websites such as Kaggle.
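
    As a small illustration of this low-effort fitting (our own sketch on a toy data set, not a production workflow), scikit-learn’s gradient boosting implementation copes with unscaled features and missing values natively, so a competitive model emerges with essentially no feature engineering:

    ```python
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import HistGradientBoostingClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_breast_cancer(return_X_y=True)

    # simulate the messy inputs black boxes tolerate: knock out 5% of values
    rng = np.random.default_rng(0)
    X = X.copy()
    X[rng.random(X.shape) < 0.05] = np.nan

    # no imputation, scaling or feature engineering: NaNs are handled natively
    clf = HistGradientBoostingClassifier(max_iter=200)
    print(cross_val_score(clf, X, y, cv=5).mean())
    ```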

    Black box models are particularly helpful when speed to market is important

    So, we should always build black box models?

    If only it were that simple. A problem with black box models is that we don’t know exactly why they do what they do (if we did, they would be interpretable). This is true even if we add an explainability overlay on these models – by definition these explainable components must be wrong some of the time, otherwise they would be the original model. Explainability techniques can identify trends in how the predictions change with various factors but there’s no guarantee this reflects what the black box model is actually doing. Troubleshooting black box models can be difficult. If we have an explainability overlay, in effect, we have two models to troubleshoot!

    Let’s consider two types of situation in which we build models:

    • Models that have very significant individual impacts (we’ll call these SII models). This could be a decision to give an individual a mortgage, or a medical diagnosis relating to a serious illness.
    • Those that don’t (non-SII models). This might include things like marketing, pricing or recommender systems.

    A simple way to think of a model is that it categorises people into different groups and develops prediction rules for each of these groups. Then, for a new person not in the model, it works out what groups this person is most similar to and forms a prediction on that basis. If the model is good, this prediction will be accurate more often than not. But no matter how good the model, it will be wrong sometimes because people are individuals, not averages.
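
    A shallow decision tree makes this grouping idea concrete. In the minimal sketch below (our illustration on a standard scikit-learn data set), each leaf of the tree is one such group, and a new person is routed to a leaf and given that group’s prediction:

    ```python
    from sklearn.datasets import load_breast_cancer
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = load_breast_cancer(return_X_y=True, as_frame=True)
    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

    # each leaf is one 'group'; a new case is routed to a leaf and
    # receives that group's prediction
    print(export_text(tree, feature_names=list(X.columns)))
    ```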

    For a non-SII model, being wrong often does not lead to highly negative individual outcomes, and a lot of the time the cost of being wrong is borne by a company and not the individual. For example, if an online entertainment streaming service starts recommending horror movies to me, I’m not going to watch them and it’s possible that I might cancel my subscription. In this case, the cost of the error is mainly borne by the streaming service responsible for developing the model that made the sub-par recommendation.

    However, suppose a model is used to diagnose a serious medical condition. This would be an example of an SII model. Here, the wrong diagnosis could be fatal where someone has a disease, or lead to unnecessary testing and distress where they don’t. A particular issue in medical problems is leakage, where inappropriate data is used in developing a model. ‘Inappropriate’ here means something that would not be available in a true prediction problem. The concept of leakage is most easily explained with an example. Claudia Perlich recently won a Kaggle competition where she built a model to detect breast cancer from images. Her model was highly successful at predicting cancer, but only because she discovered predictive information in the patient ID label, which may have arisen from different machines being used for different severities of cancer. While the ID labels were highly predictive in the competition setting, they are useless in real life. Identifying leakage can be tricky, particularly in a black box model where we don’t know exactly what’s going on.
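
    Leakage is easy to reproduce synthetically. In the sketch below (entirely made-up data, echoing the patient ID story above), an ID column that happens to encode which machine was used ‘predicts’ the label almost perfectly in validation, yet would be useless on genuinely new patients:

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(1)
    n = 2000
    cancer = rng.integers(0, 2, n)

    # the leak: suppose severe cases were imaged on machine B, and the machine
    # is encoded in the patient ID range (a made-up mechanism for illustration)
    patient_id = np.where(cancer == 1,
                          rng.integers(50_000, 60_000, n),   # machine B
                          rng.integers(10_000, 20_000, n))   # machine A
    signal = cancer + rng.normal(0, 2.0, n)                  # weak genuine signal

    X_leaky = np.column_stack([signal, patient_id])
    X_clean = signal.reshape(-1, 1)

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    print(cross_val_score(clf, X_leaky, cancer).mean())  # near-perfect: the ID leaks the label
    print(cross_val_score(clf, X_clean, cancer).mean())  # the honest, much lower accuracy
    ```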

    Another troubling example is the influence of typographical errors on COMPAS, a proprietary model that combines 137 different characteristics to produce a recidivism score and is widely used across the US criminal justice system to inform parole decisions. The model is proprietary and therefore a black box, irrespective of its actual structure, although various analyses suggest age and number of past criminal offences are important factors. Consider the life-altering impact, then, of a clerical error in one of these fields. Rudin and her team cite one such incident, where a typographical error in one of the factors wasn’t discovered until after a person was denied parole on the basis of a high COMPAS score. She maintains it’s much easier to make errors with complex black box models than with simpler, interpretable models, and that any errors in the simpler models are easier to pick up because the process is transparent and can be double or even triple checked.

    These examples caution against the use of black box models in high-stakes settings, but it’s also not the case that black box models can always be safely used for non-SII problems. Take the example of models used by companies to set prices. Frequently, these include an element of price elasticity modelling, which estimates the trade-off between sales volume and unit price. Typically, the lower the price, the higher the sales volume and vice versa, and somewhere between the two extremes is where companies operate, balancing profitability and sales targets. Sometimes, black box models get this relationship wrong on out-of-sample data, estimating that sales increase as prices increase. Deploying this type of model would likely lead to poor commercial outcomes. An interpretable model avoids this – its transparency means misfits like this should be apparent to analysts.
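
    One practical safeguard, where the direction of an effect is known in advance, is to constrain the model. Below is a minimal sketch (our own example, on synthetic data) using scikit-learn’s monotonic constraints to force predicted demand to be non-increasing in price:

    ```python
    import numpy as np
    from sklearn.ensemble import HistGradientBoostingRegressor

    rng = np.random.default_rng(0)
    n = 5000
    price = rng.uniform(1, 10, n)
    season = rng.uniform(0, 1, n)
    # true demand falls with price; noise can locally mask this in the data
    demand = 100 - 6 * price + 20 * season + rng.normal(0, 8, n)
    X = np.column_stack([price, season])

    # -1 forces predictions to be non-increasing in price; 0 leaves season free
    model = HistGradientBoostingRegressor(monotonic_cst=[-1, 0]).fit(X, demand)

    # the fitted curve now respects the economically sensible direction everywhere
    grid = np.column_stack([np.linspace(1, 10, 50), np.full(50, 0.5)])
    assert np.all(np.diff(model.predict(grid)) <= 1e-8)
    ```

    Constraints like this trade a little flexibility for a guarantee that matters commercially, although they only help where the correct direction of an effect is known.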

    The pricing example above points to something that’s often a problem for many models, but particularly black boxes: they don’t extrapolate well to new regions of data. An infamous example is the husky vs wolf classifier, which seemed to have very impressive accuracy at distinguishing between wolves and huskies, until it was realised the classifier was just detecting snow in images – the model was essentially: if snow, then husky, else wolf.

    No snow, must be a wolf!

    Black boxes bad. Interpretable machine learning models good?

    Yes and no.

    In an ideal world, we would always build interpretable models, because we would prefer an interpretable model over a black box. However, constructing interpretable models is often computationally hard (although Rudin and her team, as well as others, are seeking to improve the interpretable model toolbox). Even when it’s possible to construct an interpretable model, it usually demands much more feature engineering and selection, skilled analyst time and domain expertise, and is therefore considerably more expensive to build.

    There are many settings, particularly in the commercial world, where rapid reactivity and speed of deployment are critical. For instance, the UK motor insurance industry is dominated by aggregator websites and operates on tight margins. Being able to react to changes and deploy models quickly is vital in that environment.

    On the other hand, particularly for SII models, the expense of constructing an interpretable model should be weighed against the consequences of getting things wrong for the individual affected by the decision. Take a model like COMPAS, for example. Suppose the model assigns a high reoffending score largely because of a data entry error where previous convictions were recorded as ‘seven’ instead of ‘one’. An interpretable model, which shows that the seven previous convictions were a major factor contributing to the high score, offers some hope this error could be corrected, unlike a black box model with no explanation. Encouragingly, Rudin and her team were able to construct a very simple rules-based model, depending only on sex, age and prior offences, using the CORELS algorithm, which showed similar performance to COMPAS (the proprietary 137-factor model mentioned earlier). As well as being simpler to apply and less prone to error, it’s a much more useful starting point for considering how best to neutralise some of the inherent biases in the data underlying the model (for example, the evidence that, as a population group, young African American men are over-policed relative to their white counterparts).
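
    To show just how simple such a model can be, here is a rule list of roughly the form CORELS produces, written as ordinary code. The thresholds below are illustrative only, not the published model:

    ```python
    def rule_list_score(age: int, priors: int, sex: str) -> bool:
        """A CORELS-style rule list for recidivism risk.

        Illustrative thresholds only: not the published CORELS model, and
        certainly not fit for real decisions. Every prediction can be traced
        back to exactly one human-readable rule.
        """
        if sex == "male" and 18 <= age <= 20:
            return True          # rule 1
        if 21 <= age <= 23 and 2 <= priors <= 3:
            return True          # rule 2
        if priors > 3:
            return True          # rule 3
        return False             # default: predict no re-arrest

    # the COMPAS data-entry story: a 'seven' that should have been a 'one'
    print(rule_list_score(age=40, priors=7, sex="female"))  # True, visibly driven by priors
    print(rule_list_score(age=40, priors=1, sex="female"))  # False once the error is fixed
    ```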

    So why aren’t all models with significant impacts on individuals interpretable?

    Rudin and her team highlight several reasons why all SII models aren’t interpretable. These include the costs of development, and the possible difficulty of recouping development costs for models that end up being a simple list of rules, or a scoring system based on a small number of factors. From our experience, there’s no denying the considerably greater cost associated with this process.

    Furthermore, in cases like medical misdiagnosis, or the COMPAS example, there’s a systemic flaw in that the costs of being wrong are misaligned – when a medical diagnosis is wrong, the individual bears the cost of the mistake, not the company providing the algorithm. Dealing with this may require policy changes to encourage or demand greater interpretability in SII models, and greater accountability. Recently, the New Zealand Government launched its Algorithm Charter for Aotearoa New Zealand, which emphasises ethics, fairness and transparency but is restricted to government bodies. The EU’s General Data Protection Regulation has broader application and aims to give individuals more control over their personal data, including a ‘right to information about the logic involved’ in automated decisions (i.e. a ‘right to an explanation’). This is a step in the right direction but, as noted above, explainable ML models and interpretable ML models are two different things, and there is no guarantee that any explanation would be accurate. Furthermore, you have to know you’ve actually received an automated decision before you can seek an explanation. For example, if you were a woman not being shown job ads from Amazon because an ML algorithm was trained on data with only men in similar jobs, chances are you would never know.

    What can we do when creating an interpretable machine learning model proves difficult?

    Difficulties in creating interpretable machine learning models can sometimes be mitigated by using black box models as part of an iterative process towards an interpretable model. For example, one model-building process (which we often use at Taylor Fry) involves fitting a black box model at a preliminary stage of an analysis to identify key features of interest. Based on this, we refine our selection of features and then build an interpretable model. This is frequently iterative, since black box models are useful for identifying un-modelled patterns in the data and, at each step, our understanding of the underlying data grows. Like Rudin and her team, we find interpretable machine learning models are frequently more useful than black boxes as the final product.
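
    A minimal sketch of one round of that loop (our illustration on a toy data set; in practice the refinement of features draws heavily on domain knowledge, not just importance scores):

    ```python
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import HistGradientBoostingClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # step 1: quick black box fit to surface the strongest features
    scout = HistGradientBoostingClassifier().fit(X_train, y_train)
    imp = permutation_importance(scout, X_test, y_test, n_repeats=10, random_state=0)
    top = np.argsort(imp.importances_mean)[-5:]  # shortlist of five features

    # step 2: build the interpretable model on the shortlist
    final = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    print(cross_val_score(final, X[:, top], y, cv=5).mean())
    ```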

  • Launch of the Australian Actuaries Intergenerational Equity Index

    Today the Actuaries Institute launches its Intergenerational Equity Index. Commissioned by the Institute, and developed by Taylor Fry’s Hugh Miller, Ramona Meyricke and Laura Dixie, the Index and accompanying Green Paper show the gap between young and old has never been larger.

    Increasingly, actuaries are contributing to important public policy debates, addressing issues such as climate change, mental health and retirement. Most recently, the Actuaries Institute is looking to explore whether we are treating different generations fairly and providing opportunity to all. Intergenerational fairness involves understanding long-term, complex issues and is well suited to the actuarial mindset.

    About the Australian Actuaries Intergenerational Equity Index

    We have helped develop an Intergenerational Equity Index, and written a Green Paper, Mind the gap – Australian Actuaries Intergenerational Equity Index. The Index tracks the wealth and wellbeing scores of three distinct age groups over time: 25-34 years old, 45-54 years old, and 65-74 years old.

    The wealth and wellbeing scores are formed from 24 indicators across six domains (Economic & fiscal, Housing, Health & disability, Social, Education, and Environment) and combined into a single score. The score tracks whether wealth and wellbeing are getting better or worse over time. Relative movements of different age bands provide insight into how developments over time are affecting various age groups differently. A widening of gaps between age bands can represent deterioration in intergenerational equity.
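
    To give a feel for the mechanics, here is a toy sketch in Python of combining indicators into a single score. The indicator names, values and min-max weighting are entirely our own illustrative assumptions; the Green Paper documents the actual indicators and methodology:

    ```python
    import pandas as pd

    # made-up indicator values for one age band at three time points; the real
    # index uses 24 indicators across six domains with its own weighting choices
    indicators = pd.DataFrame({
        "home_ownership_rate": [0.51, 0.45, 0.37],
        "real_net_wealth_index": [100, 110, 120],
        "co2_ppm": [370, 390, 410],  # higher is worse
    }, index=[2000, 2009, 2018])

    # flip 'higher is worse' indicators so that larger always means better
    indicators["co2_ppm"] = -indicators["co2_ppm"]

    # min-max normalise each indicator to [0, 1], then average into one score
    normalised = (indicators - indicators.min()) / (indicators.max() - indicators.min())
    print(normalised.mean(axis=1))
    ```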

    The figure below shows the absolute index results.

    Absolute index results

    What does the Intergenerational Equity Index show?

    While some gaps between age groups should be considered normal due to natural life stages (for example, older people have had more time to accumulate wealth or buy a house), the striking result from the index is that the gap between the youngest age group (25-34, currently Millennials) and the oldest (65-74, currently Baby Boomers) has never been larger. This suggests we’re experiencing growing intergenerational tensions, as exemplified by the ‘OK Boomer’ meme seen in 2019.

    The paper itself is well worth a read for full details of how the index is constructed and what drives the results. The individual indicators show interesting trends even before being aggregated into the index. Here we highlight just a few items:

    • Real household net wealth has grown about 90% for the 65-74 age group over the past 15 years, compared with 20% for the 25-34 age group. The gains from large increases in asset values, including housing, have advantaged older generations most.
    • The proportion of 25-34-year-olds owning a home has fallen from 51% to 37% in the 17 years to 2018. This compares to a seven percentage point drop for the 45-54 age group and a one percentage point drop for the 65-74 age group. Housing affordability has worsened significantly, despite record low interest rates. It now takes longer to save for a deposit, and house prices, particularly in the bigger cities, have grown far faster than incomes.
    • Environmental issues, a commonly cited intergenerational issue, have dragged down the index for younger people. Larger temperature anomalies, more CO2 in the atmosphere and lower rainfall are all issues that are likely to continue to worsen.

    However, it’s important to remember intergenerational issues are not solely about the problems younger Australians face. The Royal Commission into Aged Care Quality and Safety has highlighted the failure of society to allow the elderly to live with dignity in their final years. And poverty rates are high for subgroups across different ages, notably for pensioners who do not own their own home.

    Why is it a good time to be talking about intergenerational equity?

    The COVID-19 pandemic has brought some intergenerational issues into sharp relief. Rates of unemployment and underemployment are almost always higher for younger people, and the recent spike has affected them disproportionately. Increased government debt will take many years of fiscal restraint to manage down as a proportion of GDP. Many people, particularly younger people, have drawn down on their superannuation early, which will have downstream consequences for wealth accumulation and standards of living in retirement.

    The Green Paper identifies existing policy proposals, across each of the domains, which could help improve intergenerational equity. For example, switching from stamp duties to a land tax should encourage more efficient use of housing supply. Additionally, a more concerted effort to limit carbon emissions, in partnership with the global community, will reduce future anthropogenic climate change.

    You can read more about the Index and its details here, and the media release here.

  • Modelling the spread of COVID-19 in Victoria

    As Australia tackles its second wave of COVID-19 infections, and deals with the resulting economic impact, it is more timely than ever to understand how effective restrictions are at slowing the spread of the disease. Dr Anna Cohen and Dr Gráinne McGuire explain how recently published modelling techniques can help explore crucial ‘change points’ in the rate of community and local transmission of COVID-19 in Victoria, and what we might expect following the latest round of restrictions.

    Earlier this year, German researchers (Dehning et al) published a paper Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions in the journal Science. The researchers’ model looked at reported COVID-19 cases in Germany in the early weeks of the pandemic alongside publicly announced interventions. Combining a standard epidemiological model of disease and Bayesian inference, they were able to quantify the impact of publicly announced interventions to stop the spread of COVID-19. The researchers found the three successive interventions – mild social distancing, strong social distancing, and contact ban – did have a significant impact on the spread of the disease.

    In the spirit of science, Dehning et al made their results and modelling code publicly available, to encourage researchers to adapt their findings to other countries or regions. With Victoria’s second wave of COVID-19 infections reaching its highest daily total of the pandemic last week, and a spate of measures over July to bring the disease under control, we explore how this research can shed light on the impact of changes to social distancing measures on the spread of the disease in Victoria.

    What cases do we look at?

    In the rest of this article we look at transmission of cases acquired locally, which means:

    • Acquired from someone who acquired the disease while overseas (more common in the earlier stages of the pandemic’s trajectory)
    • Acquired from someone who is a confirmed case of COVID-19
    • Community transmission, where someone has been infected by the disease but hasn’t been overseas and hasn’t been in recent contact with other confirmed cases.

    We don’t look at cases acquired overseas as these cases are more representative of the COVID-19 spread around the world, rather than the local transmission environment.

    Victoria’s experience with community and local transmission

    In the figure below, we show Victoria’s daily reported cases excluding those acquired by overseas travellers since March (data as at 9 August).

    Daily reported locally acquired cases of COVID-19 in Victoria

    We see:

    • An initial sharp increase in cases, peaking at the end of March following initial restrictions put in place to limit indoor and outdoor public gatherings
    • Daily cases declining over April, as stage 2 restrictions came into force on 25 March, and stage 3 restrictions from 30 March
    • From late April onwards, an upturn of cases, with many linked to clusters including the Cedar Meats abattoir (with the first case linked to the abattoir diagnosed on 2 April) and McDonald’s in Fawkner
    • Daily case numbers declining slightly over May, and following the easing of stage 3 restrictions, a steep increase in case numbers from mid-June onwards.

    What did we find when we modelled Victoria’s COVID-19 spread?

    Drawing on Dehning et al’s approach, our modelling takes a retrospective look at COVID-19 experience in Victoria over the period 1 May to 3 August, by measuring the rate of change of COVID-19 between key dates such as easing or re-introduction of restrictions.

    In the figure below, we plot the effective reproduction number as estimated by the model, and measure its change between key dates, such as the easing or re-introduction of restrictions. The effective reproduction number is the average number of people an infected individual passes the disease on to, and it determines the trajectory of the epidemic. Reproduction numbers:

    • Above 1 are indicative of a growing epidemic
    • Below 1 correspond to decreasing numbers of new infections.

    Effective COVID-19 control strategies reduce the reproduction number so that it is below 1. When the reproduction number is 1, new cases remain at a constant level.
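
    For intuition, the reproduction number can be crudely approximated directly from the case curve, by comparing smoothed case counts one serial interval apart. This is a back-of-envelope sketch only – the modelling in this article uses a full Bayesian SIR model, and the five-day serial interval below is our assumption:

    ```python
    import numpy as np

    def crude_reproduction_number(daily_cases, serial_interval=5, window=7):
        """Back-of-envelope R_t: smoothed cases divided by smoothed cases one
        serial interval earlier. The five-day serial interval is an assumption,
        and this shortcut is no substitute for the Bayesian model used below."""
        cases = np.asarray(daily_cases, dtype=float)
        smoothed = np.convolve(cases, np.ones(window) / window, mode="valid")
        return smoothed[serial_interval:] / np.maximum(smoothed[:-serial_interval], 1e-9)

    # a case curve doubling roughly weekly gives R_t comfortably above 1
    print(crude_reproduction_number(np.geomspace(10, 160, 29)).round(2))
    ```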

    Effective reproduction number in Victoria, 1 June – 3 August

    Our modelling suggests the spread of the disease:

    • Started accelerating around the end of May
    • Continued to accelerate through the first three weeks of June, coinciding with the removal of restrictions on 1 June (restaurants and pubs were allowed to reopen with 20 patrons each, combined with the removal of restrictions on family gatherings two weeks prior)
    • Slowed down following the reintroduction of restrictions on family gatherings on 22 June, but only to a point
    • Slowed down further following the progressive lockdowns of Melbourne and Mitchell Shire between 2 July and 9 July, but still remained above 1
    • Reduced further from 23 July following the introduction of mandatory masks for all residents of Metropolitan Melbourne and the Mitchell Shire. By the end of July we estimate the reproduction number as being close to 1.

    Once an intervention is introduced, the reproduction rate can take a number of days to reach its new level. In the table below we show the estimated reproduction rates that are seen after key intervention dates over June, July and August.

    Intervention date | Significance | Estimated reproduction number following the intervention
    1 June | Removal of restrictions on restaurants and pubs | 1.33
    7 June | 1 week after removal of restrictions | 2.14
    14 June | 2 weeks after removal of restrictions | 2.57
    22 June | Re-introduction of restrictions on family gatherings | 2.05
    2 July | Lockdowns on key postcodes in Melbourne | 1.70
    9 July | Lockdown of Melbourne and Mitchell Shire | 1.41
    16 July | 1 week after Melbourne and Mitchell Shire lockdown | 1.43
    23 July | Introduction of mandatory masks | 1.00

    The figures in this table correspond to those in the graph above, with approximately a three-day delay after each intervention or measurement date. For example, the estimated reproduction rate following the 23 July mask intervention reaches its level of 1.00 around 26 July.

    The model results suggest that the introduction of masks was sufficient to bring the epidemic either under control (i.e. a reproduction number of 1), or close to it, but crucially was not enough to start to reduce the number of new cases.

    What’s happened since?

    As of the date of writing, we have plotted five further days of reported COVID-19 cases (up to 9 August) after the end of the time period covered by the model (up to 3 August). Reported cases on 4 August were much higher than forecast, but numbers on 5 – 8 August are more in line with the model forecasts, in particular, that case numbers are stabilising.

    Actual and modelled daily reported locally acquired cases of COVID-19 in Victoria

    What’s next?

    It’s still too early to tell the impact of the most recent measures to control the spread of COVID-19 in Victoria, in particular the declaration of the State of Disaster and the introduction of Stage 4 restrictions in Melbourne on 2 August and Stage 3 restrictions in regional Victoria on 5 August. Modelling of the latest data will allow us to measure the impact of further restrictions on the reproduction rate.

    A bit more on how we did it

    The methodology outlined in the paper Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions combines a standard epidemiological model of disease (the SIR model) with a list of intervention dates at which the underlying epidemiological parameters change. The researchers used a Markov Chain Monte Carlo method to estimate the various model parameters at different points in time, and the uncertainty in those estimates. We have replicated this work, and importantly:

    • We have excluded cases identified in returned travellers in quarantine, as reported in the Victorian Government’s daily briefings
    • We have assumed an approximately eight-day delay between infection and identification. This covers the incubation period, the onset of COVID symptoms sufficient to warrant a test, and the time taken to get tested and receive results. Roughly speaking, this means we assume cases notified today relate to infections that took place, on average, eight days ago. Since the modelling approach uses simulations, a range of delays, centred on eight days, is used in practice.
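
    To make the moving parts concrete, here is a heavily simplified, deterministic sketch of the SIR-with-change-points setup. It is our own illustration: the actual analysis infers the spreading rates and change points by MCMC rather than fixing them by hand, and treats the reporting delay more carefully:

    ```python
    import numpy as np

    def simulate_reported_cases(N, I0, regimes, mu=0.125, T=90, delay=8):
        """Deterministic SIR with change points and a fixed reporting delay.

        A heavily simplified sketch of the Dehning et al setup: `regimes` is a
        list of (start_day, spreading_rate) pairs that we fix by hand, whereas
        the real model infers them (with uncertainty) by MCMC. mu is the
        recovery rate; R_t in each regime is spreading_rate / mu.
        """
        starts = [day for day, _ in regimes]
        rates = [rate for _, rate in regimes]
        S, I = N - I0, float(I0)
        new_infections = np.zeros(T)
        for t in range(T):
            lam = rates[np.searchsorted(starts, t, side="right") - 1]
            infections = lam * S * I / N     # new infections today
            S -= infections
            I += infections - mu * I
            new_infections[t] = infections
        # reported cases lag infections by roughly `delay` days
        reported = np.zeros(T)
        reported[delay:] = new_infections[: T - delay]
        return reported

    # illustrative rates only: easing at day 30 (R_t ~ 2.4), tightening at day 55 (R_t ~ 1)
    cases = simulate_reported_cases(N=6_500_000, I0=50,
                                    regimes=[(0, 0.10), (30, 0.30), (55, 0.125)])
    ```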