Equivariant cohomology has become an indispensable tool in algebraic geometry and in related areas including representation theory, combinatorial and enumerative geometry, and algebraic combinatorics. This text introduces the main ideas of the subject for first- or second-year graduate students in mathematics, as well as researchers working in algebraic geometry or combinatorics. The first six chapters cover the basics: definitions via finite-dimensional approximation spaces, computations in projective space, and the localization theorem. The rest of the text focuses on examples – toric varieties, Grassmannians, and homogeneous spaces – along with applications to Schubert calculus and degeneracy loci. Prerequisites are kept to a minimum, so that one-semester graduate-level courses in algebraic geometry and topology should be sufficient preparation. Featuring numerous exercises, examples, and material that has not previously appeared in textbook form, this book will be a must-have reference and resource for both students and researchers for years to come.
A set of vertices in a graph is a Hamiltonian subset if it induces a subgraph containing a Hamiltonian cycle. Kim, Liu, Sharifzadeh, and Staden proved that for large $d$, among all graphs with minimum degree $d$, $K_{d+1}$ minimises the number of Hamiltonian subsets. We prove a near-optimal lower bound that also takes the order and the structure of a graph into account. For many natural graph classes, it provides a much better bound than the extremal one ($\approx 2^{d+1}$). Among others, our bound implies that an $n$-vertex $C_4$-free graph with minimum degree $d$ contains at least $n2^{d^{2-o(1)}}$ Hamiltonian subsets.
Let $\mathcal{F}$ be an intersecting family. A $(k-1)$-set $E$ is called a unique shadow if it is contained in exactly one member of $\mathcal{F}$. Let ${\mathcal{A}}=\{A\in \binom{[n]}{k}\colon |A\cap \{1,2,3\}|\geq 2\}$. In the present paper, we show that for $n\geq 28k$, $\mathcal{A}$ is the unique family attaining the maximum size among all intersecting families without unique shadow. Several other results of a similar flavour are established as well.
We prove that for every tree $T$ of radius $h$, there is an integer $c$ such that every $T$-minor-free graph is contained in $H\boxtimes K_c$ for some graph $H$ with pathwidth at most $2h-1$. This is a qualitative strengthening of the Excluded Tree Minor Theorem of Robertson and Seymour (GM I). We show that radius is the right parameter to consider in this setting, and that $2h-1$ is the best possible bound.
We study the locations of complex zeroes of independence polynomials of bounded-degree hypergraphs. For graphs, this is a long-studied subject with applications to statistical physics, algorithms, and combinatorics. Results on zero-free regions for bounded-degree graphs include Shearer’s result on the optimal zero-free disc, along with several recent results on other zero-free regions. Much less is known for hypergraphs. We make some steps towards an understanding of zero-free regions for bounded-degree hypergraphs by proving that all hypergraphs of maximum degree $\Delta$ have a zero-free disc almost as large as the optimal disc for graphs of maximum degree $\Delta$ established by Shearer (of radius $\sim 1/(e \Delta )$). Up to logarithmic factors in $\Delta$ this is optimal, even for hypergraphs with all edge sizes strictly greater than $2$. We conjecture that for $k\ge 3$, $k$-uniform linear hypergraphs have a much larger zero-free disc of radius $\Omega (\Delta ^{- \frac{1}{k-1}} )$. We establish this in the case of linear hypertrees.
We study two models of discrete height functions, that is, models of random integer-valued functions on the vertices of a tree. First, we consider the random homomorphism model, in which neighbours must have a height difference of exactly one. The local law is uniform by definition. We prove that the height variance of this model is bounded, uniformly over all boundary conditions (both in terms of location and boundary heights). This implies a strong notion of localisation, uniformly over all extremal Gibbs measures of the system. For the second model, we consider directed trees, in which each vertex has exactly one parent and at least two children. We consider the locally uniform law on height functions which are monotone, that is, such that the height of the parent vertex is always at least the height of the child vertex. We provide a complete classification of all extremal gradient Gibbs measures, and describe exactly the localisation-delocalisation transition for this model. Typical extremal gradient Gibbs measures are localised also in this case. Localisation in both models is consistent with the observation that the Gaussian free field is localised on trees, which is an immediate consequence of transience of the random walk.
We have covered a great deal of ground in this book, and the diversion into physics and what we can learn from it may have surprised some readers – but why reinvent the wheel? Other disciplines such as physics have been around for much longer and are more formalised than where we find ourselves in data, so why not learn from them and from other professionals? We feel that when we started writing about data we were still in the Wild West stage of formalising data leadership and what it means for organisations. Time has definitely worked its magic: this area has moved on so fast, and so many wonderful voices have joined the conversation, that the idea of using data as an asset, and what that means in organisations, have both developed considerably. That can only be a good thing!
We hope that we can continue to challenge ourselves in this discipline to learn from others both within the data space and also outside it, because then we can all become better.
We thought long and hard about what to call this book and finally decided on the title Halo Data because of its application to what we are all doing. Data hasn't changed, but hopefully this book will give you a different way of thinking about it that helps you. Halo data champions the role that metadata, and its ‘distance’ from the core data, can play in how data is used and described. The paradigm shift is about unlocking value. It isn't about data being the new whatever: it is about data being data and how it delivers value to the organisation.
Just thinking about data in a different way wasn't enough for us, because we also had to work through how you make it practical. If you don't use it, why bother collecting it in the first place? If nothing else sticks from reading this book, just remember that using the data to solve a problem or create value is what really matters.
The value proposition and the paradigm shift bring ethics into sharper focus because, while data can take us to new, exciting and innovative places, it can also take us into new, darker places.
The simple answer to the question ‘what is metadata?’ is ‘data about data’. But that answer conceals so much rich detail about metadata. A bit like the analogy of the iceberg, the answer gives you the tip, but not the deep understanding of metadata that data professionals need in order to address the bigger question of data value.
Metadata has some early origins in the world of book publishing: who wrote it, when was it written, who published it and when, the territorial rights for publishing, who holds the copyright, how many chapters it contains, the subject matter and so on. If you look at a listing of a book on Amazon and scroll down a little you will see the metadata; in fact, the whole Amazon page is built from metadata about the book. The book is the data point; all the other information on the Amazon web page is metadata. Here is part of the metadata from Amazon for our first book, The Chief Data Officer's Playbook:
Best Sellers Rank: 9,651 in Books
• 26 in Data Warehousing (Books)
• 1 in Knowledge Management
• 81 in Beginner's Guide to Databases
Customer reviews: 4.2 out of 5 stars (75 ratings)
Astonishing! It even goes down to the physical dimensions of the book. All of these pieces of metadata are important to different groups of people, who probably have very different motivations and needs. The shipper or carrier is interested in the physical size of the book, the stockist is interested in the language, the foreign-rights publishers too are interested in the current languages, libraries and retailers are interested in the ISBN, and prospective readers are interested in the customer reviews.
Metadata is information about the content that provides structure, context, and meaning.
(Rachel Lovinger, Metadata Workshop, 1 March 2012)
It serves to make it easier for others to discover, assess and utilise a dataset. Discovery and use are self-explanatory, but what is assessment? This term covers all the information that might be useful in determining whether or not one can or should use the data. It answers questions such as: does the data come from a trustworthy source?
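To make the idea concrete, here is a minimal sketch in Python of how such a book metadata record might be structured. The field names and values are our own illustrative assumptions, not a formal standard such as ONIX or Dublin Core; the point is simply that descriptive, physical, reception and provenance fields serve different audiences, and that the provenance fields are what make assessment possible.

```python
# A minimal, illustrative metadata record for a book. Field names and values
# are assumptions for this sketch, not a formal standard.
book_metadata = {
    # Descriptive fields: support discovery (readers, libraries, retailers)
    "title": "The Chief Data Officer's Playbook",
    "isbn": "978-0-00-000000-0",          # placeholder, not the real ISBN
    "language": "English",
    "subjects": ["Data management", "Leadership"],
    # Physical fields: of interest to the shipper or carrier
    "dimensions_cm": {"height": 23.4, "width": 15.6, "depth": 1.8},
    # Reception fields: of interest to prospective readers
    "average_rating": 4.2,
    "rating_count": 75,
    # Provenance fields: what assessment relies on
    "source": "publisher feed",
    "last_updated": "2023-01-01",
}

def can_assess(record: dict) -> bool:
    """Crude assessment check: do we know where the data came from and how
    fresh it is? Both keys are our illustrative names, not a standard."""
    return bool(record.get("source")) and bool(record.get("last_updated"))

print(can_assess(book_metadata))  # True: this record carries provenance
```

The same record answers different questions for different consumers: the carrier reads the dimensions, the retailer reads the ISBN, and anyone assessing trustworthiness reads the provenance fields.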
There is much chatter about personal data and the accompanying legislation that is in place to protect both it and us. Whether that is the European General Data Protection Regulation (GDPR), the Data Protection Act 2018 (UK), Canada's Digital Charter Implementation Act or Japan's Act on the Protection of Personal Information, our governments are taking the protection of our personal data seriously, and this can only be a good thing. While the USA doesn't have a single data privacy law applicable to every American state, several states have passed their own laws, such as the California Consumer Privacy Act (CCPA) – which is important, as California has a larger population and annual GDP than a good number of countries. The USA also has data protection provisions in the Health Insurance Portability and Accountability Act of 1996.
This type of legislation isn't new, and regulations about how we could use personal data predate all of the above examples; however, we just weren't taking them seriously. It wasn't until awareness was raised of what was happening, and of its consequences, that people woke up and decided that legislation regarding the collection, storage, processing and use of personal data needed to be taken seriously.
What many people don't realise is that this type of legislation only ever and should only ever act as a last line of defence. We should be choosing to do the right thing because it's the right thing, not because we will be penalised if we don’t.
This is where data ethics come in. There will be something of a circular discussion here, but it's important to understand the circle and why it exists.
There is (unfortunately) example after example of why we need data ethics – or as we like to put it, of when good data turns bad, as in the following:
• chatbots which have to be pulled from service after they start making racist comments based on biased data they have picked up;
• blindly following artificial intelligence (AI) decisions without understanding the implications or biases behind them;
• toys collecting data on our children.
The consequences of these types of actions dictate the necessity for data ethics.
Before we get into what value is or should be, we need to understand why it is important. It is, or should be, obvious why it is important for the data folks (especially the data leaders within the organisation) to be able to demonstrate the value of data, because it is intrinsically linked to proving the value of their work. Everyone who is part of an organisation has to be able to prove their value – that's what objectives are all about.
However, when you are responsible for improving how data is used and for driving value from it, then, because data underpins almost all business decisions, you will have a significant impact on the future direction of the company and on how well it performs, adapts and even thrives based on the decisions it makes. If the data leaders or the CDO can't demonstrate value, the organisation won't get the investment it needs to work with data effectively, efficiently and creatively.
There are very real consequences for an organisation if it does not utilise data to its fullest extent.
• Time and effort are constantly wasted, as the organisation either is overwhelmed by data or struggles to make decisions because it has no idea which data is correct and which is as smelly as yesterday's fish.
• People get stuck doing repetitive tasks that should be easy to automate, and they end up bored and demoralised.
• Problems with data just grow, and the attitude becomes ‘we have tried this before and nothing worked, so why would it work if we tried it again?’ The problems with data and its use just become more convoluted.
• Often, organisations dig themselves deeper into a hole by making short-term fixes or ‘improvements’ so as to get the answer or result that they need now.
• And the list goes on.
But when the narrative changes and data is seen as a value creator rather than a drain on resources and a bottomless pit, then you can get
• the right kind of resources and effort focused on increasing the use of data to positively change the direction of the company;
• better engagement with people across the organisation as they focus on the interesting things rather than the monotony of manual intervention;
• more meaningful results and better decisions, leading to accelerated growth and creating a virtuous circle.
There is no point in all of the theory and explanations that we have covered so far if we can't put them into practical application. We began by saying that we are interested in solving problems, and problems aren't solved by inaction. Without a focus on understanding, managing and using your data you will never be able to properly harness the power of data to transform your organisation.
It can be really hard to decide what is the right thing to do next, or where to start. That could be a massive understatement in the current data and business environment, where things are happening at such a fast pace. Where do you go first? What do you need to do first? Organisations that aren't constantly looking to improve, reinvent themselves or even just take an honest look at themselves will only slide backwards against increasingly competitive markets. The technology that we use around data, the science, the art and the processes involved are changing at an astonishing rate. We need to make sure that we're not just keeping pace with everybody else, but that we are using data to keep us at the forefront of where we want to be. The drive for organisations to digitally transform, and the radical rethinking of how enterprises want to use data, are boosting how much they can change, and it is happening at a dizzying pace. The drive for agile thinking leads us to more than just an agile project; it becomes an agile way of life. Remember, it isn't good enough just to give people faster horses: in what we're trying to do we need to really think about the end goal, and make sure that we achieve it.
At the heart of thinking about how your organisation will change itself for the better, or radically rethink its digital and technology capability, you need to think about the data first. Ultimately, the success of any kind of digital transformation rests on three critical elements: people, data and process. In all these things we need a high degree of trust so as to be able to move forward.
Data is a business problem, so it's a business problem to solve.
In our book Data Driven Business Transformation we emphasised the importance of managing risk. Data has to be part of the overall risk process, and you need to think beyond the limits of just regulatory and legislative risk. There are a number of elements to truly understanding data risk; a sketch of how they might be captured as a single register entry follows the list.
• Definition of information risk: This should be a clear, concise description that explains the risk so that anybody reading it can understand it. Avoid jargon and acronyms. Would somebody outside your organisation be able to understand the risk as you have defined it?
• Early-warning indicators: What metrics will you put in place to indicate that you’re moving into a danger zone? Assign indicative tolerance levels to your early-warning indicators, and monitor them to ensure that they are doing the job right, and modify them if necessary.
• Causes: Look at both internal and external factors, remembering to include competitive elements, any change in demand, better use of technology, human risk, changes in both internal and external control and the potential for mismanagement.
• Risk assessment: Assess each risk overall across safety, performance, finance and reputation (political). Your organisation should have a numbering system for assigning assessed risk levels; tie in with the corporate risk assessment by using the same system for your data and information risk assessment.
• Risk assessment rationale: This is the ‘why’ section. Document the rationale for your assessment of the impact or probability of each risk. This will help to communicate the significance of each risk and place it in context with other organisational risks, so that the organisation can tailor its assets and resources to the likelihood of each risk happening.
• RACI (responsible, accountable, consulted and informed) around risk: For each different risk area you should understand who is responsible versus who is accountable. Whose opinion do you need to ask and who do you just need to keep up to date with what is happening?
• Existing controls, causes and consequences: Look for current controls within the organisation that you can use to monitor each risk. Are there other processes, already happening or that the organisation needs to complete, that will impact on the risk? Be honest.
• Improvement actions: This covers two areas: (1) try to stop the risk happening in the first place and (2) try to minimise the impact of the risk if it does happen and there is no way of stopping it.
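Taken together, these elements amount to a simple record structure. The sketch below, in Python, shows one way a single entry in a data-risk register might be captured; all field names and example values are our own illustrative choices, not a formal standard.

```python
from dataclasses import dataclass

@dataclass
class DataRiskEntry:
    """One entry in a data-risk register, mirroring the elements above.
    Field names are illustrative assumptions, not a formal standard."""
    definition: str                      # plain-language description, no jargon
    early_warning_indicators: list[str]  # metrics with tolerance levels attached
    causes: list[str]                    # internal and external factors
    assessment_level: int                # from the corporate numbering system
    assessment_rationale: str            # the 'why' behind impact and probability
    raci: dict[str, str]                 # responsible / accountable / consulted / informed
    existing_controls: list[str]         # current controls that monitor the risk
    improvement_actions: list[str]       # prevent the risk, or minimise its impact

# Illustrative usage: a hypothetical duplicated-records risk
risk = DataRiskEntry(
    definition="Customer records may be duplicated across systems.",
    early_warning_indicators=["duplicate rate above 2% of new records"],
    causes=["no single point of data entry", "manual re-keying"],
    assessment_level=3,
    assessment_rationale="Moderate likelihood; damages reporting accuracy.",
    raci={"responsible": "data quality lead", "accountable": "CDO",
          "consulted": "IT operations", "informed": "executive board"},
    existing_controls=["monthly de-duplication report"],
    improvement_actions=["introduce validation at point of entry"],
)
print(risk.definition)
```

Holding every risk in a consistent structure like this makes it straightforward to tie entries into the corporate risk register and to review the early-warning indicators on a regular cycle.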
What was happening before data professionals arrived?
Until recently, IT (information technology) owned the ‘data terminology’. With the rise of the CDO (chief data officer) and organisations wanting to increase the value they derive from their data, we need to think about data in new ways. Data is now a discipline in its own right outside of IT; it is growing up, but isn't yet fully mature. Huge technological steps have been taken, but some fundamental thinking has been omitted. People are trying to organise and govern their data, but they are struggling to identify a return on investment (RoI) for that activity and to truly release the value of data. We need to find a way to accelerate data science and analytics so that we're not just looking at their potential but realising their benefits, which will allow organisations to do more with data for less cost.
No doubt we all still read many articles and posts about data, and hear the data community discussing the problems encountered and created by data being a subset of the technology domain within organisations. This is a constant and ongoing discussion, common across vertical markets and geographies. Even if the data team isn't actually a subset of an organisation's IT department, in many cases that perception exists; and even if it doesn't, many problems may persist because data was previously a subset of the technology domain. What do we mean by ‘being a subset of technology’? We mean that, essentially, the CDO reports up to the CIO (chief information officer) or CTO (chief technology officer), or that the data function sits within the IT or technology teams reporting to the CIO. Even if the reporting lines aren't hierarchical and the CDO sits alongside the CIO/CTO, this sustains the perception of a subset.
We have been calling out this issue for a long time; indeed we wrote about it in our first book, The Chief Data Officer's Playbook, in 2017. At the risk of covering some of that ground again it is worth repeating a few points. If the CTO and the use of technology were going to ‘crack the data problem’ and ‘leverage the power of data’ then surely, after decades of the CTO/CIO role being established in organisations, this would have been achieved by now?
Data storytelling is the process of translating data analyses into understandable terms in order to influence a business decision or action. Data analysis focuses on creating valuable insights from data to give further context and understanding to an intended audience.
With the rise of digital business and data-driven decision-management, data storytelling has become a skill often associated with data science and business analytics. The idea is to connect the dots between sophisticated data analyses and decision-makers who might not have the skills to interpret the data.
(Alexander S. Gillis and Nicole Laskowski, ‘Data Storytelling’, Techtarget, December 2022, www.techtarget.com/searchcio/definition/data-storytelling)
Why storytelling is needed in data
Gillis and Laskowski provide an excellent definition of data storytelling and why it is important. Often we use data to drive decisions, create action or influence opinions and positions. This may be a decision about a performance marketing strategy, about a financial investment or transaction, about how much raw material to buy or about how fast to run the engines. As data professionals we all know, and hope, that decisions are based on data and enabled by data, if not actually data driven. One problem that we have, however, is that not all decision-makers are sufficiently data literate to understand or interpret the raw data that is put in front of them, whether in the dreaded spreadsheet or on the equally dreaded ‘dashboard’. It is an even greater leap to expect business decision-makers to understand the output from an ML (machine learning) model, or indeed the model itself, how it was trained or how it is calibrated. To get around these issues, data leaders have increasingly realised that they need to be interpreters or narrators, so as to make the data understandable and to enable engagement with insights. Scott Taylor, The Data Whisperer, is one of the leading practitioners and evangelists for data storytelling. It is all about connecting the dots, creating the narrative that brings the data alive to impact on decisions and actions. We often talk about ‘actionable insights’, but if the insights don't create action then they are of little value.
What we are talking about here is using the power of data storytelling to influence operational decisions that set the course of the business.
That, however, is only one aspect of the storytelling. Often data leaders need to create the narrative that builds the connection between a data activity and a business outcome.