Aim

I am a Node on the Edge because I cannot claim to have all the answers or a decade-long history of experience, but I can help to define questions to be solved. Hopefully there is a finite set of questions that allows most requirements to be solved, and my aim is to define these in collaboration with the community.
It would be amazing if these questions developed into some kind of NP-complete space where they all reduced to one question, as in NP-completeness theory or Monty Python's Holy Grail, but that may be asking too much.

Friday, 19 May 2017

Node observing two collaborative Edges - From ARM to Machine Learning

This was a comparison of human and machine learning ability, a history lesson in innovation, and an exposure to the significant collaboration of Hermann Hauser and Steve Furber. This collaboration has proved many things: from Acorn Computers to ARM, this country and its industry would be far smaller without it. Their story runs from the time when hardware ruled and software was merely something that displayed the hardware's performance, to what Hermann described as a somewhat unheard-of and unique case of commercialising a technique, reduced instruction set computer (RISC) chip design, largely started or defined in the US, while capitalising on it in the UK and Europe.
Despite the longevity of their collaboration, it is still easy to observe Hermann's glee, exasperation and surprise at having found a perpetual-motion magic box: the output of Steve's relentless questioning of his own, and others', conception of what we know. According to Steve, that is very little. Hermann has demonstrated how this little has been sufficient for us to universally understand, simply simulate and repetitively replicate our knowledge; in their collaboration, this has happened on a low-powered piece of silicon. Having commercialised RISC into a majority share of the mobile chipset market through founding ARM, their experience cannot be measured, and their expertise is uniquely varied and highly applicable to new advances.
The aim of this gathering was to debate the commercialisation of Artificial Intelligence. So when we debate AI, we assume it is an attempt to understand, simulate and replicate our knowledge and our intelligence. This is what we can do.
Yet there may be greater meaning, forces and mimicry that we lose when we attempt to understand, simulate and replicate our knowledge and our intelligence. We can, perhaps, replicate procedures of intelligence. Is it a cat? It has some feet and looks similar to others, so it is. What is a cat? It's a definition. How can we be sure this is not another definition we've never heard of? According to Hermann there is no true and no false, and there is no certainty; there is just probability. This is a fundamental change, from pre-designed procedures for every situation to a probability replacing every truth we thought we knew. It is here that Steve's perpetual questioning is so beneficial, something I have been on the other end of: receiving, most likely automated, e-mails about Steve's university course announcing that there would be an open-ended questioning session instead of traditional lectures, that all notes were online, and that any questions unique from the previous year would be answered then. This highlights our own handling of true and false: if we agree something is true, we don't question it. By being able to determine every concept, object and action, we can ultimately understand, sophisticatedly simulate and rudimentarily replicate our knowledge and intelligence. This is what Alan Turing, of imitation game and modern computer fame, defined: if a problem is deterministic, i.e. we can determine the answer, we can simulate it given the processing power to do so. For problems with some uncertainty, whose answer we cannot know, say a stock price in a year or the imminence of an earthquake, there is no determinism. So it was established that these problems cannot be simulated, and our universal understanding that uncertainty was indeed uncertain became ingrained.
This is such a fundamental change because machine learning allows some of these uncertain problems to be solved to a certain accuracy. At a certain accuracy, as Steve pointed out, it does not matter if you can only be 99% certain something is true. Identify 100 cats and one mouse and you still know what a cat is; identifying your cat 100 times and mistaking the neighbour's cat for yours once every three months is forgivable. Exchanging the determinism of being true or false for a probability of being true is what machine learning does. An expert estimate is better than no expert estimate, the expert qualification being the level of accuracy of the probability. This is a sometimes overlooked criterion. Identifying a cat correctly is important if you are in a pet shop purchasing a cat, because walking out with a parrot when you went to buy a cat is only a benefit when retelling the story to one's grandkids much later on. Knowing that the cat could get a disease with 50% probability is useful when deciding whether to buy insurance. So, in replacing true with a probability of true, it's important to recall that we sometimes need to know how true. It's also important to recall that something is never simply true, just an assumption that is a convenient truth. This leads to an interesting question: if we can determine uncertainty so as to ignore it, do we want to understand, simulate and replicate the intelligence that ignores the uncertain, or do we want an AI to remove all uncertainty? We discovered that we can automate, evaluate and repeat problems that were deterministic; we could potentially repeat until all uncertainty is removed. This is very much a future question. It does highlight Steve's claim that we haven't moved on much: we can imitate recognition, but there are much higher levels of intelligence that operate in high levels of uncertainty, ignore it and remove it.
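To make that true-versus-probability point concrete, here is a minimal sketch of my own (not from the debate) of what replacing a boolean answer with a probability looks like in practice. The probabilities and thresholds are invented; the point is that the decision threshold depends on what a mistake costs.

```python
# A minimal sketch of replacing a hard true/false answer with a
# probability of true. All numbers here are invented for illustration.

def decide(p_true: float, threshold: float) -> bool:
    """Turn a model's probability back into a yes/no decision.

    The right threshold depends on the cost of being wrong: labelling
    a pet photo can tolerate a casual 0.5 cut-off, while deciding to
    buy disease insurance may be worth acting on a far lower one.
    """
    return p_true >= threshold

p = 0.50  # hypothetical model output: "50% chance this cat gets the disease"
print(decide(p, threshold=0.99))  # False: not certain enough to call it fact
print(decide(p, threshold=0.30))  # True: certain enough to buy the insurance
```

The same probability yields different decisions at different stakes, which is why "how true" matters as much as "true".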

These could be a benefit, although what we have is already a benefit to many.

This topic of uncertainty also highlights the uncertainty in their own joint ventures: within ARM Holdings, it was uncertain which advancement was going to be the success.
The collaboration's start was with Acorn Computers, which had success and fame from the BBC Micro in education. Ultimately there were some uncertainties that could not be ignored. There are debates over what these were, not completely described by a massive drop in the market, a company acquisition deal and large overheads. According to another article, the value from Acorn was the basis for three titans, Acorn, Apple and VLSI, that went on to fund a new entity. The value that had given the earlier success, still there but not visible, was ARM: the RISC chipset that powered the processing, referred to as the CPU, which is in every PC and mobile device. The new entity was ARM Holdings, a reverse takeover of Acorn. The ARM RISC chipset was in a joint venture with Apple, which had shares in ARM and used its RISC chipset. When Apple was in decline and Steve Jobs returned, it is detailed in Walter Isaacson's book on Steve Jobs that the selling of ARM stock in the late 90s helped give the extra funds for the reinvention of Apple.
Acorn could not be saved from its uncertainties, while the third titan, VLSI, used the RISC chipset and had success.
ARM Holdings in the early 90s had only 12 employees, a revolutionary product and bitter experience.
It is from here that Hermann, with his still-obvious astonishment at the hidden simplicity of the business model's success, explains that, like many of its contemporaries, ARM is not what it seems. He states his favourite line: "We have never sold a RISC chipset. ARM Holdings does, however, offer licenses." A small company had no chance to make a chipset and commercialise it, even with highly advanced designs. It is here that their collaboration becomes so apparent: making the value, finding the greatest opportunity in this value, finding the market for it and approaching this market with an optimised offering. Steve managed to create this value many times; Hermann knew the market, how to approach it and how to stay in it.
Their individual successes are actually very similar and can be stated as less for more. Steve squeezed more processing from less power and size; Hermann gave less of the value, just a license rather than a product, for more profit. This is a lesson for all in a large industry dominated by large players. There were chipset creators who were very effective and advanced, and ARM Holdings allowed them to do it more efficiently and gain new advancements, whereas competitors such as Intel relied on creating chipsets themselves. It may not have been a large disadvantage for Intel to produce its own, but selling a license instead of a product allowed a small company to be successful. It is claimed that their collaboration did not schedule its approach to market or management. The failure of Acorn highlights this, although it was not critical: better to make industry ventures and evolve for the market, and ultimately the approach had evolved by the time market demand was there for RISC licenses. Understanding that the chipset's main value lay in mobile devices was its largest advancement, and meant its almost immediate rollout by Texas Instruments in Nokia devices, which dominated demand at the beginning of the mobile era. This led to 95% of the RISC chipset market share being held by ARM Holdings, which is poised to continue.
This is some way from the latest ML, the topic of the debate. A RISC chipset can do only a few procedures, maybe calculate your weekly spending. But the RISC chipset has changed, and ML is not just about doing many procedures: when many of these RISC chipsets are used together in a structure or network inspired by biology, they can do ML much more easily. This is demonstrated by Steve's research project SpiNNaker, which is replicating 1% of the brain's power and neurons.
Hermann explains that ML is a significant change because it creates new business models: what was once done by someone can now be automated. ML, according to Hermann, is significant because it brings new business models and can be applied in so many more industries, from the automation of behaviour recognition, as pursued for Uber-style services by FiveAI, a company linked to Hermann, to object recognition. Hermann correctly identifies that when this has happened before there has been more job creation than loss, although those who lost jobs were not those who gained the new ones. This was again highlighted by a question about whether we are preparing for this transformation, with Steve's answer a simple no. Despite the apparent certainty of ML and its advantages, there is still a lot of uncertainty that is not known. Hermann agrees, and describes the perpetual ability of humans to adapt to this uncertainty, which makes it possible to incorporate ML to our advantage.

So can ML be used to reduce, and maybe remove, uncertainty, i.e. predict everything? Can hardware be biologically inspired, like the SpiNNaker project that Steve is leading? Can FiveAI reduce the margin in the probabilities so as to automate, to a high enough level, decisions that do not matter to us, and so reduce our uncertainty? Maybe quantum computing devices can cope with the reduction of uncertainty, but can we cope with not having the uncertainty?

The answers to these are very uncertain. 
      

There are of course lessons to learn from this debate and its panel, other than that reducing uncertainty is itself uncertain. This collaboration is heavily rooted in one city, Cambridge. The success of something can be very delayed. There are many giants that dominate sectors, but innovation helps to avoid competition and conflict with them. Identifying the value in your offering is critical, and giving less of it for more gives longevity; this is displayed by ARM Holdings' 450 license clients, claimed to give more revenue to ARM than all of Intel's revenue in total. The giants usually have some intrinsic method that defines them, or is so incorporated that they often choose to continue with it through brute force. Innovation usually cannot break this. This intrinsic method is the giant's limitation, degradation or continuation. Innovation can move around it, facilitate, and find other business models to become bigger than the giant.

Meaning of Edges

Meaning is just what you perceive. To some Scottish people, canny means you can't do something, that you're not capable of doing it; to other Scottish people, and everyone else, someone canny is someone shrewd enough to find a way to do it. So to those canny wee folk who cannily found a way to do what they thought they canny do: don't forget that the step from canny to cannily doing is not so large.

Team Node - How to form your team of Edges

One of the fundamentals of Data Science is the team. Whether you prefer your teams more A-Team or more regional chess team, the experience of being propelled and enhanced by your teammates is something anyone who has been in a successful team cannot deny is a benefit. So how do successful teams operate and form, and how do you join or form one? We all know successful teams, whether ones we were in or ones we are fans of.
Forming that team is a skill not often recognised. How to form it, and who to include, are sometimes the hardest parts. There are memorable strategies, one of which is entitled forming, storming, norming, performing and adjourning. The forming is similar to a canapé reception: you do the forming by collecting your name badge and meeting a few attendees around a table. This is before the storming begins. Then the canapés arrive: a few make polite indications they are listening while reaching out to grab the passing canapé, others chase the canapés around the room, others rush straight over to them, and others quietly wait until halfway through, when the caviar arrives. This is the storming, where egos surface, teams iron out their conflicts and teams start to gel. The norming is the conversation agreeing that the egg pastry was the one canapé to get. It is similar to not being able to get on the metro and complaining to the others that the metro is decrepit: the gelling of team members over something. This bonds teams. Next is the performing, when someone else joins your table and you instantly perform as a team to ask questions. Last is the adjourning, when the networking session ends.

So we know that doing data science needs a team, and a team with a variety of skills. So how do you set up a strategy to form this team, or be in one? There are some unifying things that you can isolate in most teams. These are, as suggested by an MIT report: a shared understanding of the mission and commitment to goals, clearly defined responsibilities and roles, agreed ground rules, a decision-making model, effective communication, and mutual and self-evaluation.
When you have all of this, how do you form a data science team that comes up with insights to propel your team to legend status?

To give you an example, we are joined by a strategist and leader of a new data science team within the NHS: Ken Nicholson.


Where do you find yourself in the NHS, or what is your role and department?

My title is Principal Information Analyst within the Information Services Division of the NHS, known as NHS NSS.

The role involves reporting information and insight to NHS boards.

What does your organisation do for the NHS, and what does it provide?

We provide management information in terms of Health Analytics and strategic management insights. 

Mainly statistics to determine strategy: anything that allows the NHS to manage its operations more optimally.

We provide information to NHS health boards and NHS staff. We started with statistics, moved on to logistic regression and decision trees 10 years ago, then to visualisation through Tableau, and now to data science.


Is there any difference between the local and larger organisation information?

Yes, although it is basically the same structure, with the main difference being communication. How you influence and present differs in how much gets recognised or understood. Local areas get sensitive information and larger organisations get more strategic insights. It's important to remember that it's a public service, which means there are multiple customers and they all want something different. Still one of the most influential articles for us is "Management Misinformation Systems" by R. L. Ackoff, about giving the information and insight to those that understand it.

It is about managing expectations: managing how customers expect the insights to be delivered so they are relevant to them, and in turn managing the expectations of the customers of the NHS. It is something we are expert in, and this doesn't change. We are just changing how we get to the insight, using more and more advanced data science and understanding of the problems.

What is your experience with data science?

It's through Tableau that we have been able to transform what we do, using descriptive statistics and techniques such as logistic regression, which we have been applying for many years.

We have had sophisticated models for decades; one is the Scottish Patients At Risk of Readmission and Admission (SPARRA) model, which predicts for 4.2 million people their risk of admission to A&E. These are general statistics that require high accuracy. It is important to remember the size, which is often in the millions, and the significance, which is often an individual health risk. This size is difficult to get away from, because every user of the NHS is so different and has a different health risk. For a daily admission of 300 to A&E in one hospital there may be 100 different health reasons, where every individual is different. The direction is understanding these experiences more specifically to the admission. We understand that not every value of a statistic is similar or has similar consequences, so we should better represent this in our insights. We see a huge opportunity in increasing the speed of testing data science models, opening up or increasing available datasets, and the collision of data science skills in a team.
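As an aside from the interview, here is a minimal sketch of the kind of model described: logistic regression turning patient features into an admission-risk probability that can rank a large population. The features and data are invented for illustration; they are not the actual SPARRA inputs.

```python
# A toy admission-risk model: logistic regression producing a probability
# per person rather than a yes/no label. Features and data are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical features: age (scaled) and prior admissions in the last year.
X = rng.normal(size=(1000, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 1).astype(int)

model = LogisticRegression().fit(X, y)

# A risk probability per person is what lets analysts rank millions of
# patients by risk instead of just labelling them admitted / not admitted.
new_patients = rng.normal(size=(3, 2))
print(model.predict_proba(new_patients)[:, 1])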

My own experience has been through online courses, then an intensive course led by The Data Lab and the Data Incubator in the US, and then multiple further online courses. I have led on Tableau and its rollout within our division.

We bring an understanding of statistics and of how to present them: real-time descriptive or diagnostic presentations, and predictive analytics to manage the NHS. We want the ability to align these more closely to the heterogeneous experiences in the NHS through data science teams.

My role is as an overseer, understanding the benefits and limitations of these data science team projects so as to produce practical actions.


How do you see it benefiting what you deliver?

For our SPARRA product we have only three age divisions: old, middle and chaotic young. With data science project teams we can test what happens when we use demographics, other datasets or more inference. This gives us the ability to do more than predict: to ask why these predictions are given.
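To illustrate (my sketch, not Ken's), such a test might ask whether adding a hypothetical demographic feature improves a risk model, and what the coefficients say about why. All names and data below are invented.

```python
# Does adding a demographic feature help, and why? Invented data throughout.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 2000
age_band = rng.integers(0, 3, size=n).astype(float)  # the three divisions
deprivation = rng.normal(size=n)                     # hypothetical new feature
y = (0.8 * age_band + 0.6 * deprivation + rng.normal(size=n) > 1).astype(int)

for name, X in [("age only", age_band.reshape(-1, 1)),
                ("age + deprivation", np.column_stack([age_band, deprivation]))]:
    auc = cross_val_score(LogisticRegression(), X, y, scoring="roc_auc").mean()
    print(name, round(auc, 3))

# Coefficients hint at *why* a prediction was made, not just what it is.
print(LogisticRegression().fit(np.column_stack([age_band, deprivation]), y).coef_)
```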

We have had the opportunity to test this through hackathons; we recently held an open data hackathon. This proved that it's possible to manage. We have begun to implement it by being involved in the CivTech project from the Scottish Government, to help change our approach.

We are looking to prove that data science and its advancements are doable in the NHS, and to become a data-savvy organisation. It is about changing an approach; we may have the skills internally, or training can be done. It is, though, a change of what we deliver and an ability to try new approaches. This is why we are forming specific data science project teams and considering external recruiting. It is for this reason that we have become more agile, with a fail-fast-then-learn approach to manage it.

We are essentially expecting 20 failures for each success, each giving more understanding of what's happening in the organisation.


How are you going to form the data science team?

It is mostly about having the roles within the NHS and having the right number for a project. The project, the understanding of the problem and what's needed form the team. For this we foresee multiple data science teams for individual projects: on-demand teams for specific projects.

Specific roles 
It is important to remember that it is all data science, that all roles are data scientists, and that there are many different roles. Our approach is that we don't want a lot of generalist data scientists; we want experts in just one role. So one team member is just an expert in data wrangling and another is an expert in machine learning. These specific roles include, but are not limited to: data metrics, data pipelining, data querying, hypothesis generation, data science and machine learning modelling, informing and influencing experts, legal experts and data journalists.


Forming the team

My colleague Andy Gasiorowski and I, similar to the company Valve, suggest a flat hierarchy in which projects are proposed and experts are selected for individual expert roles. The teams form around projects, with no more than seven team members.

Forming these teams relies on all team members agreeing on the opportunity and the importance of the problem. When you have the chance to maximise the number of operations done per month, it gives teams an incentive. I agree with the forming, storming, norming, performing and adjourning stages, and have strategies for each. The performing is a trial; when there is one successful project, we can try to roll it out across the whole organisation.

How is this a different approach?

Before, we didn't have dedicated teams for data science: datasets had to be requested, and by the time they arrived the team had adjourned. There was not the approach or direction to try alternatives. This change of approach is to specify the project, let the data scientists specify these projects, and form teams with expert roles to do them, with management through KPIs, stakeholder involvement and management of direction against practical actions to improve our products.


How is it different to other DS teams?

When doing data science in a factory, the range of data science is small; our range is large, because it involves so many heterogeneous experiences and variables that change them. It is mostly about trying many alternative directions. We are moving towards modelling individual experiences and how we can understand them. To do this we have defined specific projects with the roles that are needed. The main things that differ are the size of the dataset and the significance of the insights, which mean you can make a difference and change people's lives.

To read a longer transcript of the interview, visit my blog: http://nodeontheedge.blogspot.co.uk/2017/05/team-node.html


Some of this is about using the same statistical analysis but specifying it to individual heterogeneous experiences, so the stream of data that is available means something to those who use the insights from it. Some of this is about simply trying models to get additional insights. It is mainly about advancing descriptive, diagnostic and predictive analytics, with a potential to do prescriptive analytics, modelling the consequences of change, much further down the line. It is apparent that the size of the datasets and the significance of the insights have held back data science in the NHS, and some of the solution is to have dedicated expert roles and well-defined projects.

The main insight from this, the importance of defining expert roles for a specific project, highlights a failing of data science recruitment: a role of data scientist or data engineer is too broad, and more specific roles are the ones that bring the most benefit. To help know which roles are which, there is a KDnuggets article, and in the S2DS cohort I was in there was a project to identify roles from specifications. Once these roles are more clearly defined, much more effective teams can be formed.

One of the other profound insights was that there was no mention of smart optimisation. The concern is not to monitor every instance and instantly optimise to it; it is to know more about the data they have. The NHS is run by people, experts, for people, and they need to know what the data means. This is why the ability to try a variety of projects is a large opportunity to get insights that describe and predict more accurately. Once this is done, it could become smart. The large difference with this data science team is the impact, and the potential is profound.


To find out more go to the website http://www.isdscotland.org/


By Gordon Rates - founder of AirNode - gordon@airnode.co.uk - @air_node - alumnus of S2DS Virtual


MIT recommendations on forming a team


Valve software 

KDnuggets article: http://www.kdnuggets.com/2015/11/different-data-science-roles-industry.html

Y NOTE 2

There are many forms of graphs. In the node-and-edge sense, a graph has entities, nodes, which hold data, and edges, which are associations of one node with another. The nodes can be represented as circles and the edges as lines from one node to another. At a networking event, the nodes could be all the attendees, and the edges the agreements made by attendees who met each other, holding the details of the action each pair agreed on. One can capture an active and rapid networking event in one representation. This makes a more in-depth observation possible: a high-level representation of everything that was agreed at the event, with the ability to deep-dive into an individual agreement and analyse more about the node entities and the edge associations between them. This is the ability of a node-and-edge graph representation.

Everything can be a node or an edge, although knowing when something is a node or an edge is a skill in graph representation. An easy solution is to say nodes are things and edges are concepts. Nodes are more constant, or just more persistent; edges are more dynamic, easier to set up and easier to dismantle. You can then put yourself in the picture. It becomes simple to set up when everything you encounter is a node and your association with it is a concept. Your iPhone stand is a node, and the concept that you own it is the edge. Your business card is a node, and all the instances you gave it out are edges. All these edges lead to other nodes: the contacts the business card was given to. Where significant follow-up was done, the edge becomes more significant and can be represented with a stronger line. Being able to understand and represent this means we can represent a significant volume of concepts and associations. It is then simpler to evaluate advantages gained and lost, with the ability to reassess how or why these were gained, to help learn to increase them.

The ability of this graph representation does not stop there. Change the nodes to other things, for example the metal parts of a train engine that function together. The nodes are the metal parts, and the edges are the level of quality one part needs so as not to damage another part it is connected to. The edges state the level of quality needed and influence the level of quality needed of the parts. This gives an approximation of the quality levels needed across all the parts: the areas where a higher level of quality is needed, i.e. potential failure areas, the areas where high quality would mean longer life, and more. The graph represents levels of quality needed. With slight changes, the graph representation can represent a lot more.

This graph representation allows us to visualise all of this in a form that is easy to understand in a short time. It means we can begin to have a larger understanding of everything. It becomes much simpler, largely because this is loosely how we represent our own knowledge, in neurons and synaptic connections; in this graph representation we present knowledge similarly, and it is easy to comprehend. There are many graph representations that can help, and many allow searching for one concept to find everything that is associated with it.
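Here is a minimal sketch of the networking-event graph described above, using the networkx library. The attendee names and agreement details are invented for illustration.

```python
# A minimal sketch of the networking-event graph: nodes are attendees,
# edges are agreements. All names and details below are invented.
import networkx as nx

G = nx.Graph()

# Nodes are the "things"; node attributes hold their data.
G.add_node("Alice", company="AirNode")
G.add_node("Bob", company="ISD Scotland")
G.add_node("Carol", company="ARM")

# Edges are the "concepts"; edge attributes hold what was agreed, and a
# weight can mark how significant the follow-up was (a stronger line).
G.add_edge("Alice", "Bob", agreed="share open datasets", weight=2)
G.add_edge("Alice", "Carol", agreed="intro to chip team", weight=1)

# Searching the graph for one node returns everything associated with it.
for neighbour in G.neighbors("Alice"):
    print(neighbour, "-", G.edges["Alice", neighbour]["agreed"])
```

Swapping the node and edge meanings, to train-engine parts and required quality levels, say, needs only different attributes on the same structure, which is the flexibility the note describes.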

Tuesday, 16 May 2017

Y NOTE

There are many forms of graphs. In the node-and-edge sense, a graph has entities, nodes, which hold data, and edges, which are associations of one node with another. The nodes can be represented as circles and the edges as lines from one node to another. At a networking event, the nodes could be all the attendees, and the edges could detail which attendees met each other and hold information on what they agreed. One can capture an active and rapid networking event in one representation, making a more in-depth observation possible: there are entities and associations between them, i.e. there are nodes and edges. Everything can be a node or an edge, although knowing when something is a node or an edge is a skill in graph representation. An easy solution is to say nodes are things and edges are concepts. You can then put yourself in the picture: everything you encounter is a node, and your association with it is a concept. Your iPhone stand is a node, and the concept that you own it is the edge. Being able to understand and represent this means we can have a larger understanding of everything. It becomes much simpler, largely because this is loosely how we represent our knowledge. There are many representations that can help, and many allow searching for one concept to find everything associated with it.
