Flaky Tests as Sticking Points in Software Development


The Impact of Flaky Tests on Software Quality and Ways to Reduce It

Flaky tests are automated tests that sometimes pass and sometimes fail even though neither the test code nor the code under test has changed. These tests are unpredictable: run under identical conditions, they may produce different results from one execution to the next.

Flaky tests are problematic because they produce false positives (a test passes even though the code is actually broken) and false negatives (a test fails even though the code is actually correct). As a result, developers waste time and effort chasing failures that have nothing to do with real defects.

To reduce test flakiness, developers can apply several measures, such as isolating the test environment, adding retry mechanisms, increasing wait times, and analyzing logs and metrics to pinpoint the underlying problems.
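As a rough illustration of the retry-mechanism idea, here is a minimal Python sketch; the decorator, its parameters, and the deliberately failing function are all invented for the example:

```python
import functools
import time

def retry(attempts=3, delay=0.1):
    """Re-run a flaky operation a few times before giving up."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(attempts):
                try:
                    return func(*args, **kwargs)
                except AssertionError as err:
                    last_error = err
                    time.sleep(delay)  # back off before the next attempt
            raise last_error
        return wrapper
    return decorator

calls = {"n": 0}

@retry(attempts=3, delay=0)
def sometimes_fails():
    """Simulated flaky check: fails on the first two runs, passes on the third."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise AssertionError("transient failure")
    return "passed"

outcome = sometimes_fails()  # succeeds on the third attempt
```

Note that retries are a mitigation, not a cure: they keep a pipeline green, but a test that needs them should still be investigated for its underlying cause.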

What Does “Flaky Test” Mean?

A flaky test is a test that produces unreliable, contradictory results: it fails on some runs and passes on others. In other words, it may fail or pass unexpectedly even though the code under test has not changed.

A test can be flaky for various reasons, including environmental issues, timing problems, race conditions, or flaws in the test's implementation. For example, a test that depends on an external resource that is not always accessible may pass or fail inconsistently, depending on the resource's availability.

Examples of Flaky Tests

Here are some typical situations, each of which can serve as an example of a flaky test:

Network-dependent tests: if an automated test depends on a network connection, for example, a test that checks data fetched from an external API, it becomes flaky when the connection is unstable or slow.

Time-dependent tests, which rely on timing parameters such as timeouts or wait periods, may become flaky if the system under test introduces even a small delay. For example, a test that checks a web page's response time may sometimes pass and sometimes fail, depending on server or network load.

Concurrency-related tests, which run simultaneously, can interfere with each other and cause flaky behavior. For example, a test that writes to the same database table as another test may sometimes fail, depending on the order in which the tests execute.

Environment-dependent tests, which rely on factors such as the availability of certain resources, can become flaky when the environment changes. A good example is a test that checks for a file's presence in the file system; it fails if another process removes the file.
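The time-dependent case can be sketched in a few lines of Python; the 50 ms deadline and the simulated delays are invented for the example. A hard deadline makes the outcome depend on load, while polling under a generous overall timeout does not:

```python
import time

def fetch_data(simulated_delay):
    """Stand-in for a network call whose latency varies with load."""
    time.sleep(simulated_delay)
    return {"status": "ok"}

def flaky_check(simulated_delay):
    """Flaky: a hard 50 ms deadline fails whenever the system is slow."""
    start = time.monotonic()
    result = fetch_data(simulated_delay)
    elapsed = time.monotonic() - start
    return result["status"] == "ok" and elapsed < 0.05

def stable_check(simulated_delay, timeout=1.0):
    """Stable: poll until the data is ready, under a generous overall timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if fetch_data(simulated_delay)["status"] == "ok":
            return True
    return False
```

With no load, both checks pass; once the simulated delay exceeds the hard deadline, only the polling version keeps passing.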

Main Causes of Flaky Tests

Test flakiness has many causes, including environmental issues (such as network connectivity or server performance), synchronization problems (such as race conditions or timeouts), and problems in the code itself (such as concurrency bugs or improperly handled exceptions).

Here are some common reasons for flaky tests:

  • Timing issues
  • Race conditions
  • Environment issues
  • Test implementation problems
  • Dependencies on external resources
  • Incomplete test coverage

How to Detect Flaky Tests?

Detecting flaky tests can be difficult because flakiness takes so many forms: timing issues, race conditions, and unreliable test data, among others.

Several problems commonly arise during detection: the flakiness may not show up immediately, false positives and false negatives can be mistaken for real results, and an unreliable environment or poor test design can mask the true cause.

Here Are Some Ways to Detect Flaky Tests:

Analyze test results and identify tests whose outcomes contradict each other. This helps reveal patterns that point to instability.

Monitor test runs to find inconsistent tests by executing the same test several times and comparing the results.
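The run-several-times-and-compare approach can be sketched as a small harness; the sample test and the run count are invented for the example:

```python
import collections

def detect_flakiness(test_func, runs=50):
    """Run a test many times and report how often each outcome occurred.

    A genuinely stable test produces a single outcome; any mix of passes
    and failures across identical runs is a strong flakiness signal.
    """
    outcomes = collections.Counter()
    for _ in range(runs):
        try:
            test_func()
            outcomes["pass"] += 1
        except AssertionError:
            outcomes["fail"] += 1
    return dict(outcomes)

# Illustrative flaky test: fails on every third invocation.
state = {"n": 0}
def sample_test():
    state["n"] += 1
    assert state["n"] % 3 != 0

report = detect_flakiness(sample_test, runs=30)
```

A report containing both "pass" and "fail" entries for the same test is exactly the inconsistency this detection method looks for.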

Record the execution time of each test run and compare it with that test's average. If a run takes considerably longer than average, it may indicate that the test is flaky.

Use code analysis tools, such as SonarQube or Code Climate, to detect potentially low-quality tests. These tools can surface code smells, test coverage gaps, and other flakiness indicators.

Run tests in parallel, which can help expose unreliable tests. If a test fails inconsistently, running it alongside other tests can help uncover the cause of the failure.

Track test dependencies, since tests that depend on external resources or services can be unstable. Check the availability and consistency of those resources to make potentially flaky tests easier to detect.

Review the test code for potential causes of flakiness, including thread synchronization, sleep statements, and race conditions.

In general, detecting flaky tests requires a combination of monitoring, analysis, and code review. Finding and fixing flakiness early improves testing accuracy and reduces the number of flaky runs.

How to Fight Flaky Tests?

Fighting test flakiness requires a mix of technical and process approaches, which are usually combined in practice. Here are some strategies that work:

Flaky test identification and prioritization. Identify the unreliable tests and prioritize them by their impact on software quality and development time. Prioritizing failing tests helps developers focus on the most critical issues.

Correction of flaky tests. Once developers have found unstable tests, they need to fix them and eliminate the root cause. This may involve refactoring the test code, fixing race conditions, improving synchronization mechanisms, and reducing dependencies on external resources.

Test automation. Automation can reduce the likelihood of low-quality tests by ensuring more consistent and reliable results. It also helps reveal unreliable tests more quickly and precisely.

Running tests in parallel. Launching tests in parallel helps reveal flaky tests by executing them in different environments or at different times, which exposes the environment- and timing-related problems that can cause flakiness.

Test isolation. Isolating unstable tests from the rest of the suite reduces their impact and prevents a flaky test from causing other tests to fail.

Test results monitoring. Track test results to detect unreliable tests, their failure frequency, and their impact on software quality. This helps reveal trends and patterns that point to flakiness.

Improving test coverage. More thorough testing of the software reduces the probability of unreliable tests and helps reveal and fix problems before they turn into flaky tests.
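The isolation strategy above can be sketched with Python's unittest; the test case and file names are illustrative. Each test runs in its own temporary directory, so files left behind by one test cannot fail another:

```python
import io
import os
import tempfile
import unittest

class FileProcessingTest(unittest.TestCase):
    """Each test gets a private temporary directory, so no test can
    observe files left behind by another test (or by a previous run)."""

    def setUp(self):
        self._tmp = tempfile.TemporaryDirectory()
        self.workdir = self._tmp.name

    def tearDown(self):
        self._tmp.cleanup()  # remove everything this test created

    def test_creates_report(self):
        path = os.path.join(self.workdir, "report.txt")
        with open(path, "w") as f:
            f.write("ok")
        self.assertTrue(os.path.exists(path))

suite = unittest.defaultTestLoader.loadTestsFromTestCase(FileProcessingTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

The same idea applies to databases, environment variables, and global state: set up fresh state in `setUp` and tear it down afterwards, instead of sharing it between tests.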


Here is a summary of the causes of flaky tests, their consequences, and recommended remedies:


Timing issues. Tests can pass or fail inconsistently because of timing factors such as network delays, input/output latency, and wait times. Remedy: use waits and retries, run tests in parallel, and isolate them in virtual environments to ensure a consistent testing environment.

Race conditions. Tests pass or fail inconsistently when the ordering of parallel threads and processes is unpredictable. Remedy: use synchronization mechanisms, including locks, semaphores, and barriers, to manage access to shared resources.

Environment issues. Tests can pass or fail inconsistently because the testing environment differs from the production environment, for example, in dependency versions or hardware configuration. Remedy: use mocks and stubs to isolate tests from external dependencies, use virtual environments or containers to ensure a consistent testing environment, and monitor differences from the production environment that can cause flakiness.

Test implementation problems. Tests can pass or fail inconsistently because of problems in the test code itself, such as incorrect assertions or improperly cleaned-up test data that depends on execution order. Remedy: refactor tests to improve code quality, reliability, and accuracy; clean up test data properly; and make tests independent of execution order.

Dependencies on external resources. Tests may pass or fail inconsistently because they depend on external resources such as databases, third-party services, or APIs. Remedy: use stubs, mocks, and other test doubles to simulate external resources, reduce dependencies on them, and consider in-memory databases or other alternatives.

Incomplete test coverage. Tests may pass or fail inconsistently because the test suite does not cover all execution paths and edge cases. Remedy: improve test coverage to exercise all code paths and edge cases, and use mutation testing or other techniques to find gaps in coverage.

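The synchronization remedy for race conditions can be illustrated with a shared counter; the thread and iteration counts below are arbitrary. Without the lock, concurrent increments may be lost; with it, the final count is deterministic:

```python
import threading

counter = {"value": 0}
lock = threading.Lock()

def increment(times):
    for _ in range(times):
        with lock:  # serialize access to the shared value
            counter["value"] += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock held around each increment, the result is always 40000.
```

A test that exercises code like this without proper locking is exactly the kind of test whose outcome depends on thread scheduling, and therefore flakes.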
It is worth noting that these remedies do not always work perfectly. The best approach depends on the particular case and on the nature of the failing test. In addition, prioritizing and fixing failing tests is crucial, since they affect both software quality and development time.


In general, flaky tests can seriously harm software quality, development time, and release cycles, so it is important to find and fix them as soon as possible. Fighting flaky tests requires a proactive approach: detecting, prioritizing, and solving problems quickly and effectively. By applying the remediation strategies above, developers can increase the reliability and efficiency of their testing efforts and build better software.

What is DevSecOps Pipeline and Why It’s Important?


The Importance of Integrating DevSecOps Pipeline into the DevOps Workflow

You might know that DevSecOps is all about security measures in the software development process, but how should it look in practice? How can you use it to create a secure CI/CD pipeline? What are the main DevSecOps phases? What is the definition of DevSecOps? And which tools should you use in a typical DevSecOps pipeline? In this article, we will answer these questions and discuss the secure pipeline in detail, including its benefits, components, and best practices for implementing it. We will also provide case studies of companies that have successfully implemented the DevSecOps pipeline.

Cyber Threat Scenario

Let’s say you own a hypothetical software development firm. You’re proud of your team of talented developers using cutting-edge technologies and generating innovative ideas. One day, however, disaster struck. The company fell victim to a cyber attack, and its systems were breached by a group of hackers. The hackers stole sensitive customer information, including personal and financial data, and brought the entire company to its knees.

In the aftermath of the attack, the company struggled to recover. Your reputation was tarnished, and your customers lost faith in your ability to protect their information. The company’s finances took a hit, and many employees were left without jobs.

This scenario is not that outlandish. Around 60% of small businesses go under within six months of being hacked. That’s why it’s crucial to learn this valuable lesson: start working on a comprehensive security plan to protect your systems, your data, and your customers.

Start educating your employees on the importance of security. Implement policies and procedures to ensure that all employees follow best practices when it comes to protecting sensitive data. Invest in cutting-edge security technologies, including firewalls, intrusion detection systems, and advanced encryption methods.

Over time, your efforts will pay off. When your systems are secure, your customers will be confident that their information is safe. Implementing security measures is crucial for any business that deals with sensitive information. Cyber attacks are a real threat, and they can have devastating consequences. But with the right approach and the best security measures in place, you can protect your data and your customers.

What is DevSecOps Pipeline?

If you’re unfamiliar with the concept, it may seem too complex and even scary at first. Cynics may even say it’s the perfect way to add even more complexity to your already complex tech processes. Who needs simplicity and efficiency when you can have a pipeline that’s so convoluted, it requires a whole new set of skills just to navigate it?

Just think about all the extra steps you get to take to make sure your code is secure. Plus, who doesn’t love waiting for those security scans to finish before moving on to the next step? It’s like a fun game of “will this pass or fail?” every time!

And let’s not forget the joys of collaboration between developers, security experts, and operations professionals. Clear communication? Never heard of it. Instead,  you have misunderstandings at every turn. It may look like a game of telephone, but with your codebase.

If you think that the DevSecOps pipeline is just a way to add extra layers of complexity and chaos to your tech processes, a never-ending game of whack-a-mole in which the moles are your code vulnerabilities and they keep popping up no matter how many times you hit them, we suggest reading this article. You may change your mind about the topic.

In reality, DevSecOps is the best way of integrating security measures into every step of the software development life cycle. The traditional approach to software development, which involved developing software in silos by different company departments or outsourcing contractors and then handing it off to the security team for testing, is no longer sufficient in today’s fast-paced, agile environment. Developers and operations teams need to work together to ensure that security is built into every step of the development process.

The Basics of DevSecOps Pipeline

DevSecOps pipeline is an approach to software development that integrates security into the DevOps workflow. It is based on the principle that security should be built into every phase of the development process, from planning and design to coding, testing, and deployment.


Picture this: you’re the captain of a ship sailing the vast ocean of software development. You have a skilled crew of developers, operations personnel, and security experts all working together to ensure a smooth voyage. But what if we told you that you could make your journey even smoother with the power of DevSecOps?

With DevSecOps pipeline architecture, you can spot any lurking sea monsters early in the journey, when they’re still small and easy to handle. This means you can save time and money by avoiding any costly detours or battles with larger, more dangerous beasts later on.

Not only that, but with the magic of automation, your crew can focus on more important tasks, like charting your course and making sure your ship is running smoothly. This means you can cover more ground and reach your destination faster, all while keeping an eye out for any potential threats.

And let’s not forget about the benefits of better collaboration between your team members. With DevSecOps, everyone is working together towards a common goal, making it easier to communicate and share ideas. It’s like having a well-oiled machine, where everyone knows their role and works seamlessly together.

But perhaps the most important benefit of all is that security becomes a shared responsibility across the entire crew. No longer is it just the responsibility of a few security experts – everyone on board is responsible for ensuring the safety and security of your journey.

So what are you waiting for? Set sail with DevSecOps and discover the true potential of your software development journey.

Components of DevSecOps Pipeline

DevSecOps pipeline is made up of several components, each of which plays an important role in ensuring that security is built into the development process.

Source Code Management (SCM)

Source code management is one of the most crucial components of a DevSecOps pipeline. It involves the use of a version control system to manage changes to the code base. This allows developers to collaborate on code, track changes, and roll back to previous versions if necessary. The most common SCM tools are:

  • Git
  • SVN
  • Mercurial

Continuous Integration (CI)

Continuous integration is a process in which developers integrate code changes into a central repository on a regular basis. The code is then automatically built and tested, and any issues are identified and resolved immediately before they cause more serious problems. This process ensures that the code is always in a working state and that any issues are identified and addressed early in the development process. Popular CI tools include:

  • Jenkins
  • CircleCI
  • Travis CI
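In practice, CI is configured in the CI tool itself (for example, a Jenkinsfile or a CircleCI config), but the gating logic it applies can be sketched in a few lines of Python. The stage commands below are placeholders for a real build and test suite:

```python
import subprocess
import sys

def ci_step(command):
    """Run one pipeline stage; the build stops at the first failure."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0

# Illustrative stages; a real pipeline would invoke actual build/test tools.
stages = [
    [sys.executable, "-c", "print('build ok')"],   # build
    [sys.executable, "-c", "assert 1 + 1 == 2"],   # test
]

# The pipeline passes only if every stage exits successfully.
passed = all(ci_step(cmd) for cmd in stages)
```

The key CI property this models is fail-fast feedback: a broken stage blocks the pipeline immediately, so issues are caught before they reach later steps.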

Continuous Deployment (CD)

Continuous Deployment, or CD for short, allows you to effortlessly deploy your code changes to production. The process automates the build, testing, and deployment steps, with minimal to no human intervention.

With CD, you can say goodbye to the days of manual deployments and the risk of human error that comes with them. Instead, you can rest assured knowing that the process is fully automated and any issues are identified and addressed before the software is released to production.

It’s like having a personal assistant who takes care of all the mundane tasks so you can focus on the more important things. CD tools work tirelessly in the background, ensuring that your code is always in a releasable state and that deployments are consistent and repeatable.

CD is not just a tool, it’s a mindset. It’s about embracing a culture of continuous improvement and constant feedback. With CD, you can deliver software faster, with higher quality, and with less risk. It’s a game-changer that can transform the way you develop and deploy software.

Some popular CD tools include:

  • Octopus Deploy
  • Argo CD
  • Harness

Security Testing

Just like a secret undercover operation, security testing is the stealthy and strategic process of identifying any hidden vulnerabilities that could pose a threat to the software’s security. It’s a tireless part of the DevSecOps pipeline, constantly scanning the code for any potential breaches.

Security testing can be conducted using different techniques, from the classic method of manual testing to the modern approach of automated testing. Just as a master thief would carefully analyze every aspect of a building’s security system, static code analysis, dynamic application security testing (DAST), and software composition analysis (SCA) are the tools that security testers use to assess the software’s defenses.

These techniques help testers identify potential security flaws such as cross-site scripting, SQL injection, and buffer overflows. Like a skilled detective, security testing allows developers to anticipate and thwart any malicious intent before it becomes a serious threat.

In the end, the software emerges as a fortified fortress, ready to stand up to any attacks. Security testing relies on precision to ensure that the software is well-equipped to handle any security issues that may arise.

For this purpose, you can use such tools as:

  • SonarQube
  • Veracode
  • Checkmarx

Infrastructure as Code (IaC)

Infrastructure as Code is the practice of managing infrastructure using code, rather than manual processes. This involves creating scripts or configuration files that define the desired state of the infrastructure, which can be version-controlled and automated. Popular IaC tools include:

  • Terraform
  • Ansible
  • Chef


Containerization

Containerization involves packaging an application and its dependencies into a lightweight, portable container. Containers can be easily deployed across different environments, making it easier to scale applications and maintain consistency. Docker is the most widely used containerization tool.

Monitoring and Logging

Monitoring and logging are essential for detecting and diagnosing issues in production environments. To monitor and analyze application and system metrics, logs, and alerts, you can use such tools as:

  • Prometheus
  • Grafana
  • ELK Stack

Security in Design

Security in design is the practice of designing software with security in mind from the beginning. This involves identifying potential security risks and designing the software to mitigate those risks. For example, suppose the software requires users to enter sensitive information, such as credit card or social security numbers. In that case, the software should be designed to encrypt that information to protect it from hackers.
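As one hedged illustration of designing with security in mind: for sensitive values that only need to be verified, never read back, a related design option is to store a salted hash instead of the plaintext. This sketch uses only the standard library; a real system would use a vetted cryptography library and proper key management, and the example value is fake:

```python
import hashlib
import hmac
import os

def protect_secret(value, salt=None):
    """Store a salted PBKDF2 hash of a sensitive value instead of plaintext."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", value.encode(), salt, 100_000)
    return salt, digest

def verify_secret(value, salt, stored):
    """Recompute the hash and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", value.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, stored)

# Example with a fake social security number; never log or store the plaintext.
salt, stored = protect_secret("123-45-6789")
```

The design decision happens before any code is written: deciding that the plaintext value never needs to be stored is what makes this protection possible.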

Implementing DevSecOps Pipeline


If you want to boost your organization’s efficiency, security, and collaboration, then you need to get on board with DevSecOps pipelines. These practices can help you deliver lightning-fast software, rock-solid security, and seamless teamwork between development and operations teams. But let’s not kid ourselves, there will be some challenges to overcome along the way. Don’t worry though, with a little bit of grit and determination, you can conquer any obstacle that comes your way.

1. Cultural Shift

You need to be prepared for a major cultural shift. The truth is, DevSecOps demands a continuous and collaborative approach to software development and deployment, which can be a major challenge for organizations that are used to working in silos. But this is one of those challenges that can be and should be overcome. To make DevSecOps work, you need to break down those barriers and create a culture of collaboration and shared responsibility. When everyone is on the same page and working towards the same goal, amazing things can happen. So don’t let a little cultural shift hold you back from realizing the full potential of DevSecOps.

2. Tooling Integration

Another challenge of implementing DevSecOps is integrating the various tools and technologies required to support the software development pipelines. This can include integrating source code management, continuous integration/continuous delivery (CI/CD) tools, security testing tools, and infrastructure-as-code (IaC) tools. Ensuring that these tools are properly integrated and working together can be complex and time-consuming.

3. Security Skills Shortage

DevSecOps requires a strong focus on security, which can be challenging for organizations that lack security expertise. This can lead to a skills shortage, with a lack of qualified security professionals available to oversee the security aspects of the pipeline. To address this challenge, organizations may need to invest in training and education for their existing staff, or consider partnering with external security experts.

4. Compliance and Regulation

Many industries are subject to strict regulatory and compliance requirements, which can make implementing DevSecOps more challenging. Compliance requirements may include ensuring data privacy, maintaining audit trails, and demonstrating compliance with industry standards. Organizations need to ensure that their DevSecOps pipeline meets these requirements, which can add additional complexity and cost.

5. Legacy Systems and Applications

Finally, legacy systems and applications can pose a challenge to implementing DevSecOps. Legacy systems may be difficult to integrate with modern tools and technologies, or may not be designed to support a continuous delivery approach. This can make it challenging to fully automate the pipeline and achieve the desired benefits of DevSecOps.

Organizations need to address these challenges in order to successfully implement DevSecOps, including fostering a collaborative culture, integrating tools and technologies, addressing security skills shortages, complying with regulations, and managing legacy systems and applications.

How to Improve Collaboration in Your Team?

If you want everyone in your organization to cooperate better, just force them to work on group projects, even if they hate each other’s guts. It doesn’t matter if the project is completely irrelevant to their job description or if they have no interest in it whatsoever. Just make them do it. That’ll bring them closer together.

And don’t forget the team-building exercises. Because nothing screams “collaboration” like playing trust games with your coworkers. Blindfolded, you must trust that your colleague will catch you before you hit the ground. And if they don’t? Well, it’s all good fun, right?

Also, you may consider the open office concept. Why have walls and doors when you can have a shared workspace where everyone can see and hear each other? And those who want to escape constant distractions and interruptions can just wear headphones and listen to soothing music.

If all else fails, just force your employees to socialize outside of work. Schedule mandatory after-work drinks and make sure everyone attends. Because if they don’t want to be friends with their coworkers, they’re obviously not team players.

In conclusion, if you want to create a collaborative culture in your organization, just force it upon your employees. They’ll thank you for it…eventually.

Best Practices to Follow to Ensure Success

There are some DevSecOps steps that organizations can take to ensure success.

1. Automate Everything

In every phase of the DevSecOps pipeline, automation is an absolute game-changer. By automating key processes like development, testing, and deployment, organizations can slash the time and money it takes to develop software, while also ensuring that security is baked into every single step of the process. It’s a total win-win situation that you don’t want to miss out on. So if you’re ready to streamline your development process and level up your security game, automation is the way to go.

2. Create a Culture of Collaboration

DevSecOps pipeline requires collaboration between development, operations, and security teams. To create a culture of collaboration, organizations should:

  • Foster open communication between teams
  • Encourage cross-functional teams
  • Provide training and resources to help teams understand each other’s roles and responsibilities
  • Reward teams for working together to achieve common goals
3. Implement Security Testing Early and Often

To ensure that security is built into every step of the development process, organizations should implement security testing early and often. This includes using DevSecOps pipeline tools such as static code analysis, dynamic application security testing (DAST), and software composition analysis (SCA) to identify and mitigate security vulnerabilities.

4. Use Secure Coding Practices

Secure coding practices are essential for building secure software. Developers should be trained in secure coding practices and should follow coding standards such as OWASP Top 10 and CWE/SANS Top 25.

Case Studies

Many well-established companies have successfully implemented the DevSecOps CI/CD pipeline in their operations. Here are some of the most prominent DevSecOps pipeline examples:


Netflix

Netflix is a streaming service that uses the DevSecOps pipeline to ensure that its software is secure and reliable. The company has a team of security experts who work closely with developers and operations teams to identify and mitigate security vulnerabilities. Netflix uses tools such as static code analysis, DAST, and SCA to automate security testing and ensure that security is built into every step of the development process.

Capital One

Capital One is a financial services company that has implemented DevSecOps pipeline to ensure the security of its software. The company uses automation tools to speed up the development process and ensure that security is a priority at every step of the way. Capital One also employs a security team that works in cooperation with developers and operations teams to identify and mitigate security vulnerabilities.

Aim at the Future

As the world advances, so too does the art of software development. The future is a canvas yet to be painted, a world yet to be explored.

Software development has come a long way since its inception, and the future promises even more innovation. Imagine a world where software not only understands what you want but anticipates your needs before you even know them. Where machines work in tandem with humans to create software that is not just functional, but intuitive and immersive.

The future of software development is not just about writing lines of code, but about creating experiences that transform the way we interact with technology. It’s about understanding the nuances of human behavior and incorporating that into software design. It’s about creating software that is accessible to all, regardless of ability or language.

Artificial intelligence and machine learning will play a critical role in the future of software development. With the ability to analyze vast amounts of data, machines will be able to identify patterns and trends that humans may miss, leading to faster and more efficient software development.

In the future, software development will also be more decentralized and collaborative. Teams will work together, sharing code and ideas in real-time, regardless of their location. The rise of open-source software will only accelerate this trend, leading to a more transparent and inclusive development process.

As we move forward, the future of software development is limited only by our imagination. The possibilities are endless, and the potential for innovation is limitless. Let us embrace this future, and create software that not only solves problems but inspires and delights us in ways we never thought possible.


DevSecOps pipeline is a groundbreaking methodology for software development that fuses security into the heart of the DevOps workflow. This forward-thinking approach allows organizations to identify and eliminate security vulnerabilities at the earliest stages of development, saving valuable time and resources.

By incorporating security into every facet of the development process, teams can reduce the need for costly security testing later on and establish a culture of collaboration. This ensures that security is a shared responsibility across the organization, fostering a sense of teamwork and cooperation.

To put DevSecOps pipeline into practice, organizations should prioritize automation, cultivating a culture of collaboration and implementing security testing from the outset. By using secure coding practices, organizations can build top-tier software that meets the demands of both their clients and stakeholders.

Adopting the best practices of DevSecOps pipeline is the key to unlocking the full potential of software development, ensuring a streamlined, secure, and high-quality process. With this groundbreaking methodology, organizations can stay ahead of the curve and deliver exceptional software solutions.

Bias in AI Problem In Life and Technology


Bias in AI

What is Bias in AI and How to Avoid It?

Algorithms that weigh things, events, or people for different purposes are never truly neutral. To build impartial artificial intelligence systems, we first need to understand how algorithms become biased. The goal of this article is to explain what AI bias means, describe its types, give real-world examples of bias in AI, and show how to mitigate the risks associated with it.

First, let us define what AI Bias is.

What Are Biased Algorithms and Why Do They Matter?

Algorithmic bias refers to repeatable, systematic errors in a computer system that produce unfair results, such as privileging one arbitrary group of users over others.

There are two broad types of bias in AI. The first is algorithmic bias, which arises when an AI model is trained on biased data. The second is societal bias, where our social norms and assumptions create blind spots or fixed expectations in our thinking.

For instance, a credit-scoring algorithm can fairly deny you a loan if it consistently weighs the relevant financial indicators.

Why are biased algorithms so significant?

The explanation is simple: people write the algorithms, select the data those algorithms use, and decide how the results are applied. Without careful awareness and thorough training, people can pass their subtle, unconscious biases into AI systems, which then automate and perpetuate them.

How Bias Appears in Machine Learning

Machine learning bias, sometimes called bias in AI, occurs when an algorithm systematically produces skewed outcomes because of wrong assumptions made during the machine learning process.

The following types of AI bias are common:

Algorithmic bias

This occurs when there is a problem within the algorithm itself, in the computations that power the machine learning model.

Sample bias

It occurs when there is a problem with the data used to train the machine learning model: the data set is too small or not representative enough to teach the system. For instance, if the training data contains only female teachers, the system will conclude that all teachers are women.
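As a toy illustration of sample bias, the sketch below trains a trivial "model" that simply predicts the majority label it saw during training. The data set and labels are invented for this example:

```python
# Minimal sketch of sample bias: a toy "model" that predicts whatever
# label dominated its training data. All data here is invented.
from collections import Counter

def train_majority_model(labels):
    """Return a model that always predicts the most common training label."""
    most_common = Counter(labels).most_common(1)[0][0]
    return lambda _example: most_common

# Biased training sample: every teacher in the data happens to be female.
training_labels = ["female", "female", "female", "female"]
model = train_majority_model(training_labels)

# The model now predicts "female" for every teacher it ever sees,
# including ones who are not.
print(model({"name": "John", "occupation": "teacher"}))  # prints "female"
```

The point is not the (deliberately silly) model but the data: no algorithm can recover information that the biased sample never contained.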

Prejudice bias

Here, the records used to train the system reflect existing prejudices, stereotypes, or faulty social assumptions, which carry those real-world biases into machine learning. For instance, training on medical staff data containing only female nurses and male doctors bakes a lasting stereotype of medical employees into the system.

Measurement bias

As the name suggests, this bias arises when the underlying data are measured or evaluated inaccurately. A system meant to assess workplace satisfaction will be biased if it is trained on photos of happy employees who already knew the goal of the exercise was to look happy. Similarly, a system trained to estimate weight will be biased if the weights in its training data were consistently rounded.

Exclusion bias

It takes place when an important data point is left out of the data set being used, typically because the developers failed to recognize it as significant.

The Most Common Bias in AI Examples

A bias is a belief about a person or a group of people that is not grounded in known facts. A widespread example is the belief that women are weak, even though many women worldwide are known for their strength; another is the belief that black people are dishonest, when in fact most are honest.

Algorithmic bias describes repeatable, systematic errors that lead to unfair results. For instance, a credit-scoring algorithm can fairly deny a loan if it consistently weighs the relevant financial indicators. But if the algorithm grants credit to one group of customers while refusing nearly identical customers from another group based on unrelated criteria, and this behavior repeats, we call it AI algorithm bias. The bias may be intentional or unintentional; it can, for example, creep in through biased records inherited from the employee whose job the algorithm is taking over.

Consider a facial recognition algorithm that detects white faces more easily than black faces because white faces dominated the training data. Minorities suffer from this: it denies them equal treatment, and the resulting discrimination and oppression can go on indefinitely. The problem is that such biases are unintended and hard to detect until the software is explicitly tested for them.

Here are some common examples of bias in AI that we can encounter in real life:

Racism in the US healthcare system

Technology should help reduce health inequality, not make it worse for populations already facing persistent prejudice. AI systems trained on unrepresentative health data usually perform poorly for under-represented population groups.

In 2019, researchers in the USA discovered that an algorithm used in American hospitals to predict which patients would need medical care heavily favored white patients over black patients. The algorithm relied on patients' past healthcare spending as a proxy for their medical needs.

That figure, however, correlates strongly with race: black patients with the same conditions spend less on medical care than white patients with the same problems. The researchers worked with the health services provider Optum to reduce the system's bias by 80%. Had the AI not been questioned, it would have continued to discriminate against black patients.

The assumption that CEOs are exclusively men

Women make up 27% of CEOs. Yet according to 2015 reports, only 11% of the people appearing in Google image search results for "CEO" were women. An independent study by Carnegie Mellon University later concluded that Google's online advertising showed high-income job listings to men more often than to women.

Google responded that advertisers can specify the audiences and websites to which the search engine should show their ads, and gender is one of the attributes companies can set.

Nevertheless, there is a hypothesis that Google's algorithm could have determined on its own that men are better suited to executive positions. Researchers believe it could have learned this from user behavior: if, for example, only men see and click on ads for high-paying vacancies, the algorithm will learn to show those ads only to men.

AI bias in Amazon's hiring algorithm

Automation played a key role in Amazon's dominance over its e-commerce competitors. People who worked at the company stated that it used artificial intelligence in hiring, assigning job seekers ratings of one to five stars, much like customers rate products on the Amazon platform. When the company noticed that its new system could not assess applicants for software development and other technical positions in a gender-neutral way, because it was biased against women, it set out to make the adjustments needed for an unbiased ranking system.

Amazon's model had learned from patterns in the résumés submitted to the company, most of which came from men, reflecting male dominance in the industry. The algorithm concluded that male candidates were preferable: it penalized CVs that indicated the applicant was a woman, and it downgraded applicants who had attended either of two all-women's colleges.

Amazon then edited the software to make it neutral toward these particular terms, but that could not guarantee other biases would not emerge. Recruiters used the tool's suggestions when searching for new staff but never relied fully on its ratings. After Amazon's leadership lost faith in the initiative, the project was shut down in 2017.


How Can AI Bias Be Prevented?

Based on the issues above, here are some ideas for preventing biased algorithms from taking hold in our life and work.

Test machine learning algorithms the way they will be used in real life

Take job candidates as an example. An AI-based decision may not be trustworthy if the system was trained on data from one particular group of candidates. That may not be a problem while you apply the AI to similar applicants, but the issue surfaces when you apply it to a group of candidates your data set never covered. In effect, you are asking the algorithm to project what it learned about the previous applicants onto people for whom those assumptions are wrong.

To prevent this kind of AI bias, test the algorithm in the same way you intend to use it in practice.

Account for fairness when preventing bias in AI

Moreover, we should recognize that the term "fairness," and the way it is measured, is itself up for debate. It can shift under the influence of external factors, and an AI system should account for such shifts as well.

Scientists have already created many methods to make AI systems fairer, such as preprocessing the data, adjusting the system's decisions after the fact (post-processing), or building a particular definition of fairness into the training process itself. Counterfactual fairness is one such method: it guarantees that the model's decision would be the same in a counterfactual world where sensitive attributes, such as gender, race, or sexual orientation, were different.
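As a rough sketch of that counterfactual check, consider the toy example below. The loan model, the applicants, and all thresholds are invented; the helper simply verifies that a decision does not change when only the sensitive attribute is flipped:

```python
# Toy illustration of a counterfactual-fairness check. The model,
# applicants, and thresholds are all invented for this sketch.
def loan_model(applicant):
    """A deliberately biased toy model: it peeks at gender."""
    score = applicant["income"] / 1000
    if applicant["gender"] == "male":
        score += 5  # unfair bonus
    return score >= 40

def counterfactually_fair(model, applicant, attribute, values):
    """True if the decision is identical for every value of the
    sensitive attribute, everything else held equal."""
    decisions = {model({**applicant, attribute: value}) for value in values}
    return len(decisions) == 1

applicant = {"income": 38000, "gender": "female"}
print(counterfactually_fair(loan_model, applicant, "gender",
                            ["male", "female"]))
# prints False: flipping gender flips the decision, so the model fails
# the counterfactual-fairness check for this applicant
```

Real counterfactual fairness is defined over causal models rather than a simple attribute swap, but the intuition is the same: sensitive attributes alone should not change the outcome.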

Considering a "human-in-the-loop" system

The purpose of a human-in-the-loop system is to accomplish what neither a human nor a computer can do alone. When the machine cannot solve a problem, people step in and find the solution instead. The process creates a continuous feedback loop.

This continuous feedback teaches the system and improves its performance with every subsequent run. Keeping a human in the loop thus yields more precise handling of rare data sets and greater safety and accuracy.

Creating unbiased systems by changing technical education

In a New York Times article on fighting bias in technology, Craig Smith argued that we need serious changes in how people are educated in technical fields. He calls for reform of technical education: today it is taught from a purely objective standpoint, and it needs to become more interdisciplinary, with curricula revised accordingly.

He argues that some key issues need to be agreed on globally, while others should be settled at the local level. We must create rules and regulations, and empower authorities and specialists to oversee such algorithms and their outcomes. Collecting more diverse data is just one criterion; on its own, it will not solve the problem of AI bias.


Biases in all areas of our social, private, and professional lives are a serious issue, and they are very hard to overcome by simply trusting standard AI-based computation and default assumptions. Bias can cause errors rooted in algorithms misinterpreting the data they are given, leading to wrong results and poor outcomes in science, manufacturing, medicine, education, and other fields. We must fight bias with testing, by building fair systems, by letting the right humans intervene in automated processing, and by changing how we educate.

Power BI Python with Instances – What is the Best Way to Visualize Python Code?


Python with Power BI

Python With Power BI

When you need interactive data analysis, you can use Microsoft Power BI. It not only lets you interact with your data but also visualizes it for business intelligence, which is what "BI" stands for. Better still, you can use Python with it.

With Python you can extend Power BI's capabilities: ingest, transform, augment, and visualize data in dashboards, and even call on sophisticated tools such as machine learning libraries. All of this becomes possible by combining the two technologies, Power BI and Python.

Both experts and newcomers can use this guide to learn how to pair Python with Power BI. Along the way we will use the pandas and Matplotlib libraries.

pandas is an open-source library for working with relational and labeled data in a straightforward way. It provides data structures suited to interactive dashboards, along with functions for handling numbers and time series. pandas is built on top of the NumPy library, which makes it efficient and high-performing.

Matplotlib is an excellent library for visualizing arrays of data. It is useful when you need to plot large amounts of numbers, and it offers many plot types and options, such as line charts, bar charts, and even histograms.

So when would you use Power BI with Python? A typical answer: to build a sales data dashboard.
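As a small taste of what these two libraries offer, the sketch below plots made-up monthly sales figures with pandas and Matplotlib. The numbers, column names, and output file name are invented for illustration:

```python
# A quick sketch of the kind of chart a sales dashboard might contain.
# All figures below are made up for illustration.
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

sales = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "revenue": [12500, 14100, 9800, 16300],
})

# Draw a bar chart of revenue per month and save it to a PNG file.
ax = sales.plot.bar(x="month", y="revenue", legend=False, rot=0)
ax.set_ylabel("Revenue (USD)")
ax.set_title("Monthly sales")
plt.tight_layout()
plt.savefig("monthly_sales.png")  # or plt.show() in an interactive session
```

Later in this tutorial, the same pandas and Matplotlib calls run inside Power BI itself rather than as a standalone script.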

Getting Started with Power BI

Since Power BI is made by Microsoft, it runs only on Windows. Do not worry, though: you can still use it on macOS, a Linux distribution, or any other operating system that supports virtual machines. Power BI requires Windows 8.1 or newer, and if Windows is not your main operating system, plan on about 30 gigabytes of disk space for a virtual machine.

Installation of Power BI Desktop

Here we will set up all the tools so that afterward we can write Python code. Microsoft Power BI Desktop is a powerful collection of tools and services that can be obtained free of charge, without a Microsoft account and with no Internet connection required. You can easily install it by opening the Microsoft Store from the Start menu or through its web-based storefront. Like a traditional office suite, it works offline, giving you all the tools and services you need in one place. Installing Power BI Desktop from the Microsoft Store also ensures quick, automatic updates to the latest version without requiring administrator rights.


If the usual method of installing Power BI Desktop doesn’t work for you, you can always try downloading the installer from the Microsoft Download Center and running it manually. This executable file is about 400MB in size. Once you have the application installed, launch it and you’ll be met with a welcome screen. At first, the Power BI Desktop user interface may seem intimidating, but don’t worry. You’ll become accustomed to the basics as you progress through the tutorial.

What Python Code Editor to Choose?

To maximize your experience, why not install Microsoft Visual Studio Code? This free and modern code editor is immensely popular and can be easily found in the Microsoft Store. If you already use an IDE such as PyCharm or don’t require any of the advanced editing capabilities, feel free to skip this step. Otherwise, Visual Studio Code is a great choice for any coding enthusiast.

Microsoft Power BI Desktop offers only basic code editing capabilities, which is understandable given its primary purpose as a data analysis tool. Unfortunately, it lacks advanced features such as intelligent contextual suggestions, auto-completion, and syntax highlighting for Python, which are essential for writing anything but the simplest Python scripts. Consequently, it is highly recommended that you use an external code editor for writing more complex Python scripts for Power BI.

Visual Studio Code runs natively on any major operating system, no virtual machine required, and you can download it from Microsoft's website. Installation is simple: the installer guides you through the process.

VS Code is a modern code editor that brings seamless support for a wide array of programming languages via its extensions. Unlike PyCharm, it does not support Python out of the box, but it will offer to install the relevant extensions as soon as you need them. Once you open an existing Python file or create a new one, it automatically recognizes the language and prompts you to install the recommended set of extensions for Python programming.

All of this works only if you already have Python itself installed on your computer; a quick search will turn up installation instructions for your platform.

Does Power BI Desktop Need Extra Libraries?

The answer is yes. To unleash the full potential of Power BI Desktop, make sure your Python setup includes pandas and Matplotlib. These libraries are not part of the standard installation, though they come bundled if you use Anaconda. Note that installing third-party packages into the global Python interpreter is discouraged, as it can cause conflicts, and on a Windows machine Power BI may be unable to run the system interpreter at all due to permission restrictions. The solution is a Python virtual environment: a secure and efficient way to manage your Python packages and dependencies.

What is a Virtual Environment?

A virtual environment is an isolated folder containing a copy of the main Python interpreter, letting you experiment to your heart's content. You can install any additional libraries inside it without any risk of disrupting other programs that rely on Python. And whenever you wish, you can simply delete the folder holding your virtual environment without any adverse effect on the Python installation on your machine.

To create a new virtual environment, open a Windows terminal and run:

python -m venv python-virtual

Here we named the folder "python-virtual," but you can choose any name you like.

In just a few moments, a fresh folder containing a copy of the Python interpreter will appear on your desktop. You can then activate the virtual environment by running its activation script, and install the two libraries that Power BI requires. To do so, type the following commands while your desktop is still your current working directory:

python-virtual\Scripts\activate
python -m pip install pandas matplotlib

Upon activation, you should be able to identify your virtual environment with the name “python-virtual” in the command prompt. Failure to do so would result in the installation of additional third-party packages into the primary Python interpreter, which is precisely what we aimed to avoid. Congratulations, you’re almost there! You can repeat the activation and pip installation steps if you wish to incorporate additional libraries into your virtual environment. Finally, the next step is to inform Power BI of the location of Python in your virtual environment.

Let’s Run Python on It

Firstly, we need to set special options in Power BI Desktop. Once you access the configuration options, you’ll find various settings organized by categories. Locate the category named “Python scripting” in the left-hand column, and proceed to set the Python home directory by selecting the “Browse” button.

To ensure proper functionality, it’s important to specify the path to the “Scripts” subfolder in your virtual environment, which contains the “python.exe” executable. If your virtual environment is located in your Desktop folder, your path should resemble the following format:


The segment before "\Desktop" should be your Windows user name. If the designated path is incorrect and does not contain a virtual environment, you will receive an appropriate error message.
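Assuming the virtual environment was created on the Desktop as shown earlier, the path would look roughly like this, with the placeholder replaced by your Windows user name:

```text
C:\Users\<YourUserName>\Desktop\python-virtual\Scripts
```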

Great job! You have now successfully connected Python to Power BI. One key setting to verify is the path to your virtual environment, which should include both the pandas and Matplotlib libraries. With this setup complete, you're ready to start exploring the capabilities of Python with Power BI in the next section.

So next let’s talk about running the code and how it works.

How Can You Operate With It?

There are several methods available for running Python in Power BI, each of which integrates seamlessly with a data analyst's regular workflow. One such method involves using Python as a data source to import or create datasets within your report. Another uses Python to perform data cleaning and other transformations on any dataset directly in Power BI. Additionally, Python's advanced plotting libraries can be used to create engaging and informative data visualizations. This article will explore all three of these applications in detail.

Using pandas.DataFrame

If you need to ingest data into Power BI from a proprietary or legacy system, you can use a Python script to connect to the system and load the data into a pandas DataFrame. This is a useful approach when the data is stored in an obsolete or less commonly used file format that Power BI does not support natively.

To get started, you can write a Python script that connects to the legacy system and loads the data into a pandas DataFrame. Once the data is in a DataFrame, you can manipulate it using the pandas library to clean and transform the data as needed.

Power BI can then access the DataFrame by connecting to the Python script and retrieving the DataFrame. This allows you to leverage the power of both tools – the data manipulation capabilities of Python and the visualization and reporting capabilities of Power BI.

In this tutorial, we’ll use Python to load fake sales data from SQLite, which is a popular file-based database engine. While it is technically possible to load SQLite data directly into Power BI Desktop using an appropriate driver and connector, using Python can be more convenient since it supports SQLite out of the box.

Before jumping into the code, it would help to explore your dataset to get a feel for what you’ll be dealing with. It’s going to be a single table consisting of used car dealership data stored in the “sales.db” file.

Let's imagine this file has a thousand records, with columns describing the goods sold, their buyers, and the sale dates. Remember what we mentioned about Anaconda? It ships with Jupyter Notebook, which serves as a code editor. You can quickly inspect this sample database by loading it into a pandas DataFrame and sampling a few records in a Jupyter Notebook with the following code:

import sqlite3
import pandas as pand

# Load the whole sales table into a DataFrame.
with sqlite3.connect(r"C:\Users\User\Desktop\sales.db") as connection:
  df = pand.read_sql_query("SELECT * FROM sales", connection)

df.sample(5)  # preview a few random records


Note that the path to the sales.db file may be different on your computer. If you can’t use Jupyter Notebook, then try installing a tool like SQLite Browser and loading the file into it.

At a glance, you can tell that the table needs some cleaning because of several problems with the underlying data. However, you’ll deal with most of them later, in the Power Query editor, during the data transformation phase. Right now, focus on loading the data into Power BI.

As long as you haven’t dismissed the welcome screen in Power BI yet, then you’ll be able to click the link labeled Get data with a cylinder icon on the left. Alternatively, you can click Get data from another source on the main view of your report, as none of the few shortcut icons include Python. Finally, if that doesn’t help, then use the menu at the top by selecting Home › Get data › More… as depicted below.

Doing so will reveal a pop-up window with a selection of Power BI connectors for several data sources, including a Python script, which you can find by typing Python into the search box.

Select it and click the Connect button at the bottom to confirm. Afterward, you’ll see a blank editor window for your Python script, where you can type a brief code snippet to load records into a pandas DataFrame.

You can notice the lack of syntax highlighting or intelligent code suggestions in the editor built into Power BI. As you learned earlier, it’s much better to use an external code editor, such as VS Code, to test that everything works as expected and only then paste your Python code to Power BI.

Before moving forward, you can double-check if Power BI uses the right virtual environment, with pandas and Matplotlib installed, by reading the text just below the editor.

While there’s only one table in the attached SQLite database, it’s currently kept in a denormalized form, making the associated data redundant and susceptible to all kinds of anomalies. Extracting separate entities, such as cars, sales, and customers, into individual DataFrames would be a good first step in the right direction to rectify the situation.

Fortunately, your Python script may produce as many DataFrames as you like, and Power BI will let you choose which ones to include in the final report. You can extract those three entities with pandas using column subsetting, as follows:

import sqlite3
import pandas as pand

with sqlite3.connect(r"C:\Users\User\Desktop\sales.db") as connection:
  df = pand.read_sql_query("SELECT * FROM sales", connection)

# Column names other than sale_price and sale_date are not shown in the
# original data set, so the ones below are illustrative placeholders.
cars = df[["make", "model", "year"]]
customers = df[["customer", "city"]]
sales = df[["sale_price", "sale_date"]]

First, you connect to the SQLite database by specifying a suitable path to the sales.db file, which may look different on your computer. Next, you run a SQL query that selects all the rows in the sales table and loads them into a new pandas DataFrame called df. Finally, you create three additional DataFrames by cherry-picking specific columns. It's customary to abbreviate pandas as pd, and you'll often see the name df used for general, short-lived DataFrames. As a rule, though, choose meaningful, descriptive variable names to keep your code readable.

When you click OK and wait for a few seconds, Power BI will present you with a visual representation of the four DataFrames produced by your Python script.

The resulting table names correspond to your Python variables. When you click on one, you’ll see a quick preview of the contained data. The screenshot above shows the customers table, which comprises only two columns.

Select cars, customers, and sales in the hierarchical tree on the left while leaving off df, as you won’t need that one. You could finish the data import now by loading the selected DataFrames into your report. However, you’ll want to click a button labeled Transform Data to perform data cleaning using pandas in Power BI.

In the next section, you’ll learn how to use Python to clean, transform, and augment the data that you’ve been working within Power BI.

Using Python in the Power Query Editor

If you have followed the instructions in this guide, you should now be in the Power Query Editor, which displays the three DataFrames you selected earlier. These DataFrames are referred to as queries in this particular view. However, if you have already imported data into your Power BI report without applying any transformations, there’s no need to worry! You can access the same editor at any time.

To do so, navigate to the Data perspective by clicking on the table icon located in the center of the ribbon on the left-hand side, and then select Transform data from the Home menu. Alternatively, you can right-click on one of the fields in the Data view on the far right of the window and choose the Edit query for the same result. Once you have accessed the Power Query Editor window again, you will be able to see your DataFrames or Queries on the left-hand side, while the Applied Steps for the currently selected DataFrame will be displayed on the right-hand side, with rows and columns in the center.

Each step in the Applied Steps represents a sequence of data transformations that are applied in a pipeline-like fashion against a query, from top to bottom. Each step is expressed as a Power Query M formula. The first step, named Source, involves invoking your Python script, which generates four DataFrames based on the SQLite database. The other two steps extract the relevant DataFrame and transform the column types.

By clicking the gear icon next to the Source step, you'll reveal your data ingestion script's original Python source code. This feature lets you access and edit the Python code baked into a Power BI report even after it has been saved as a .pbix file.

You can insert custom steps into the pipeline for more granular control over data transformations. Power BI Desktop offers plenty of built-in transformations that you’ll find in the top menu of Power Query Editor. But in this tutorial, you’ll explore the Run Python script transformation, which is the second mode of running Python code in Power BI:

Conceptually, it works almost identically to data ingestion, but there are a few differences. First of all, you may use this transformation with any data source that Power BI supports natively, so it could be the only use of Python in your report. Secondly, you get an implicit global variable called dataset in your script, which holds the current state of the data in the pipeline, represented as a pandas DataFrame.

Note: As before, your script can produce multiple DataFrames, but you’ll only be able to select one for further processing in the transformation pipeline. You can also decide to modify your dataset in place without creating any new DataFrames.

Pandas lets you extract values from an existing column into new columns using regular expressions. For example, some customers in your table have an email address enclosed in angle brackets (<>) next to their name, which should belong to a separate column.
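As a standalone sketch outside Power BI, with made-up sample data, this extraction could look like the following in pure pandas:

```python
import pandas as pd

# Hypothetical sample of the customer column described above: some values
# carry an email in angle brackets, some don't.
df = pd.DataFrame({"customer": ["Alice Smith <alice@example.com>", "Bob Jones"]})

df = df.assign(
    # Strip the "<...>" part to get the bare name.
    full_name=df["customer"].str.replace(r"\s*<[^>]*>", "", regex=True),
    # Capture the text between angle brackets as the email (NaN if absent).
    email=df["customer"].str.extract(r"<([^>]+)>", expand=False),
).drop(columns=["customer"])
```

In Power BI itself you would apply the same `assign()` call to the implicit `dataset` variable instead of a local `df`.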

Select the customers query, then select the last Changed Type step, and add a Run Python script transformation to the applied steps. When the pop-up window appears, type the following Python code:

dataset = dataset.assign(
    full_name=dataset["customer"].str.replace(r"\s*<[^>]*>", "", regex=True),
    email=dataset["customer"].str.extract(r"<([^>]+)>", expand=False),
).drop(columns=["customer"])

When working with Power BI, you can use the implicit dataset variable in your script to reference the customers DataFrame, giving you access to its methods and allowing you to override it with your transformed data. Alternatively, you can define a new variable for the resulting DataFrame. During the transformation, you add two new columns, full_name and email, and then remove the original customer column containing both pieces of information.

Once you’ve finished your transformation, clicking OK and waiting a few seconds will display a table showing the DataFrames your script produced. In this case, there is only one DataFrame named dataset, as you reused the implicit global variable provided by Power BI for your new DataFrame. To choose your desired DataFrame, simply click the yellow Table link in the Value column.

Your customers table now has two new columns, allowing you to quickly identify customers who have not provided their email addresses. If you desire further transformations, you can add additional steps. For example, you could split the full_name column into separate columns for first_name and last_name, assuming that there are no instances of customers with more than two names.

Be sure to select the final transformation step and insert another Run Python script. The corresponding Python code for this step should appear as follows:

dataset[
    ["first_name", "last_name"]
] = dataset["full_name"].str.split(n=1, expand=True)
dataset.drop(columns=["full_name"], inplace=True)

Unlike in the previous step, the dataset variable now refers to a DataFrame that already contains the full_name and email columns, because you're further down the pipeline. Also, notice the inplace=True parameter, which drops the full_name column from the existing DataFrame rather than returning a new object.
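The splitting behavior can be checked in isolation. With n=1, everything after the first space stays together in the second column (the names below are hypothetical):

```python
import pandas as pd

# n=1 performs at most one split, so multi-part last names survive intact.
names = pd.Series(["Alice Smith", "Bob van Dyk"])
parts = names.str.split(n=1, expand=True)
parts.columns = ["first_name", "last_name"]
```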

You’ll notice that Power BI gives generic names to the applied steps and appends consecutive numbers to them in case of many instances of the same step. Fortunately, you can give the steps more descriptive names by right-clicking on a step and choosing Rename from the context menu:

By editing Properties…, you may also describe in a few sentences what the given step is trying to accomplish.

When you’re finished transforming your datasets, you can close the Power Query Editor by choosing Close & Apply from the Home ribbon or its alias in the File menu:

This will apply all transformation steps across your datasets and return to the main window of Power BI Desktop.

Next up, you’ll learn how to use Python to produce custom data visualizations.

Power BI Python Data Visualization

So far, we've covered importing and transforming data using Python in Power BI Desktop. Python's third and final application is creating visual representations of your data. When it comes to visualizations, you can use any of the supported Python libraries, provided you've installed them in the virtual environment that Power BI utilizes. However, Matplotlib serves as the foundation for plotting, and the other libraries delegate to it under the hood.

If Power BI hasn’t already directed you to the Report perspective following your data transformations, you can now navigate by clicking on the chart icon on the left ribbon. This will bring up a blank report canvas where you can add your graphs and other interactive components, collectively referred to as visuals.

Over on the right in the Visualizations palette, you’ll see several icons corresponding to the available visuals. Find the icon of the Python visual and click it to add the visual to the report canvas. The first time you add a Python or R visual to a Power BI report, it’ll ask you to enable script visuals.

In fact, it’ll keep asking you the same question in each Power BI session because there’s no global setting for this. When you open a file with your saved report that uses script visuals, you’ll have the option to review the embedded Python code before enabling it. Why? The short answer is that Power BI cares for your privacy, as any script could leak or damage your data if it’s from an untrusted source.

However, if you’ve configured Power BI to use an external code editor, then clicking on the little skewed arrow icon (↗) will launch it and open the entire scaffolding of the script. You can ignore its content for the moment, as you’ll explore it in an upcoming section. Unfortunately, you have to manually copy and paste the script’s part between the auto-generated # Prolog and # Epilog comments back to Power BI when you’re done editing.

Note: Don’t ignore the yellow warning bar in the Python script editor, which reminds you that rows with duplicate values will be removed. If you only dragged the color column, then you’d end up with just a handful of records corresponding to the few unique colors. However, adding the vin column prevents this by letting colors repeat throughout the table, which can be useful when performing aggregations.

To demonstrate an elementary use of a Python visual in Power BI, you can plot a bar chart showing the number of vehicles painted in a given color. Here's the Python code for the visual:

import matplotlib.pyplot as plt

plt.style.use("seaborn")

series = dataset[dataset["color"] != ""]["color"].value_counts()
series.plot(kind="bar", color=series.index, edgecolor="black")

plt.show()


The script starts by enabling Matplotlib's theme that mimics the seaborn library, which provides a more visually appealing look and feel than the default theme.

Next, you remove any records with missing color data and count the number of remaining records in each unique color group. This results in a pandas.Series object that can be plotted and color-coded using its index, which consists of the color names. Finally, you render the plot by calling plt.show().
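The filtering and counting steps can be verified on their own with a tiny, made-up color column:

```python
import pandas as pd

# Hypothetical color data; empty strings stand for missing colors.
dataset = pd.DataFrame({"color": ["red", "", "blue", "red"]})

# Drop blank colors, then count occurrences of each remaining color.
series = dataset[dataset["color"] != ""]["color"].value_counts()
```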

With these steps, you can easily create a basic visualization of your data using Python in Power BI. Of course, the possibilities for visualizing your data are endless, and you can explore and experiment with other Python libraries and techniques to create even more engaging and informative visualizations.

Additional Settings For Power BI Desktop

With the power of pandas and Python, there are countless possibilities for transforming your datasets in Power BI. Some examples include:

  • Anonymizing sensitive personal information, such as credit card numbers
  • Identifying and extracting new entities from your data
  • Rejecting sales with missing transaction details
  • Removing duplicate sales records
  • Unifying inconsistent purchase and sale date formats

These are just a few ideas to get you started, but the possibilities are endless. While we can't cover everything in this article, don't hesitate to experiment on your own. Keep in mind that your success in using Python to transform data in Power BI depends on your understanding of pandas, which is the library that Power BI uses under the hood. The more you learn about pandas and its capabilities, the more you can achieve with your data in Power BI.

    Special Code Editor

    Within the Python scripting options in Power BI, a useful setting allows you to specify the default Python integrated development environment (IDE) or code editor you prefer to use when working on a code snippet. You can stick with the operating system’s default program associated with the .py file extension, or you can select a specific Python IDE of your choice to launch within Power BI. This flexibility can make it easier and more efficient for you to write and debug Python code directly in Power BI.

    To indicate your preferred Python Integrated Development Environment (IDE), opt for "Other" from the initial dropdown menu and navigate to the executable file of your preferred code editor. For instance, you might browse to a path like this one:

    \Desktop\Programs\Microsoft VS Code\Code.exe

      As before, the path to your app can be different and contain different folders.

        What Are the Cons of Using Python in Power BI?

        Python integration in Power BI Desktop has some limitations you should be aware of.


          The most notable limitations are related to timeouts, data size, and non-interactive visuals. Your data ingestion and transformation scripts defined in Power Query Editor can’t run longer than thirty minutes. Python scripts in Power BI visuals are limited to only five minutes of execution, and there are additional data size limitations, such as only being able to plot the top 150,000 rows or fewer in a dataset and the input dataset can’t be larger than 250 megabytes.


            In Power BI Desktop, the communication between Power BI and Python is done by exchanging CSV files. Therefore, when using Python to manipulate data, the script must load the dataset from a text file created by Power BI for each run, and then save the results to another text or image file for Power BI to read. This redundant data marshaling can result in a significant performance bottleneck when working with larger datasets. It is the biggest drawback of Python integration in Power BI Desktop.
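A rough sketch of that CSV round trip, using an in-memory buffer in place of Power BI's temporary files, shows where the overhead comes from: every run pays serialization and parsing costs.

```python
import io

import pandas as pd

# Simulate one hop of the Power BI <-> Python exchange: write the frame
# out as CSV text, then parse it back in, as happens on every script run.
original = pd.DataFrame({"vin": ["A1", "B2"], "color": ["red", "blue"]})
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)
```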

            If you encounter poor performance, you may want to consider using Power BI's built-in transformations or the Data Analysis Expressions (DAX) formula language instead of Python. Another approach to improve performance is to reduce the number of data serializations by collapsing multiple steps into a single Python script that does the heavy lifting in bulk. For example, instead of making multiple steps in the Power Query Editor for a very large dataset, you can combine them into the first loading script.
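For example, two conceptual steps, dropping incomplete rows and converting a column type, can be collapsed into one script so the data is serialized only once (the column name here is hypothetical):

```python
import pandas as pd

dataset = pd.DataFrame({"amount": ["10", "20", None]})

# One combined step: remove missing rows and cast types in a single pass,
# instead of two Applied Steps that each round-trip through CSV.
dataset = dataset.dropna().assign(amount=lambda df: df["amount"].astype(int))
```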

              Python Visualization

              Data visualizations created using Python code are static images, which means you can’t interact with them to filter your dataset. However, Power BI will update the Python visuals in response to interacting with other visuals. It’s worth noting that Python visuals take slightly longer to display due to the data marshaling overhead and the need to run Python code to render them.

                Other Limitations

                Using Python in Power BI has some other minor limitations. For instance, it can be difficult to share Power BI reports that rely on Python code, since the recipients need to install and configure Python themselves. Additionally, all datasets in a report must be set to a public privacy level for Python scripts to work properly in the Power BI service. Furthermore, Power BI supports only a finite set of Python libraries. There may be additional minor limitations, described in Microsoft's documentation on preparing a Python script and on the known limitations of Python visuals in Power BI.

                A Beginner’s Guide to Anonymous Types in C#


                Anonymous Types C#

                What Are Anonymous Types in C#?

                Anonymous types are a powerful instrument in object-oriented programming. In a strongly typed language like C#, we normally have to define a variable's type before creating it. Sometimes, though, we need an instance of an object with read-only properties without declaring a named type first, and that is exactly what anonymous types provide.
                They encapsulate a set of properties into a single object without a type definition, and each object can hold any combination of properties, regardless of their data types, as long as every property is initialized with a name and a value.
                They are a natural fit for SQL-like LINQ queries, which often need to shape new objects on the fly. Used well, they streamline your C# and .NET code while giving developers a concise way to create structured objects without a declared type.

                How to Use Anonymous Types

                Anonymous typing simplifies development and produces cleaner, more readable code. In combination with generics, you can create strongly typed collections, which is useful in LINQ queries, when passing attributes around, or when creating models in ASP.NET MVC.
                Anonymous types don't support inheritance or interface implementation. If you need a more complex type that requires methods or fields, your best choice is to define a class.

                C# Anonymous Types Example

                We made an array called hackers that contains two objects with the properties Type and Exp, a list called uniqueHackers, and a hash set called seenTypes.

                The code iterates through the hackers array, and for each unique type, finds the maximum experience among all hackers. Finally, it creates a new object with a unique type and maximum experience and adds it to the uniqueHackers list.

                This code will find the hacker(s) with the highest experience for each type of hacker. Note that uniqueHackers is a list of anonymous types stored as object, so you won't be able to access the properties directly by name. Instead, you can use reflection or dynamic typing to access them.

                var hackers = new[] {
                  new { Type = "Anonymous", Exp = 3 },
                  new { Type = "Anonymous", Exp = 5 }
                };

                var uniqueHackers = new List<object>();
                var seenTypes = new HashSet<string>();

                foreach (var person in hackers) {
                  if (!seenTypes.Contains(person.Type)) {
                    var maxExp = person.Exp;
                    foreach (var otherPerson in hackers) {
                      if (person.Type == otherPerson.Type &&
                          otherPerson.Exp > maxExp) {
                        maxExp = otherPerson.Exp;
                      }
                    }
                    uniqueHackers.Add(new { Type = person.Type, Exp = maxExp });
                    seenTypes.Add(person.Type);
                  }
                }

                Nested Anonymous Type in C#

                Anonymous types can’t be used outside of the scope where they are defined, so they may not be suitable for more complex scenarios or for sharing data across multiple methods or classes.

                Here's a code example of how you could create a nested anonymous type from the hackers array in the original code. This style is more concise and easier to read when dealing with simple data structures. It's important to note that readability can suffer if there are too many nested levels; in such cases, it may be preferable to use named types or nested classes instead.

                var hackers = new[] {
                  new {
                    Type = "Anonymous",
                    Exp = 3,
                    Address = new {
                      Street = "123 Main St",
                      City = "Anytown",
                      State = "CA",
                      ZipCode = "12345"
                    }
                  },
                  new {
                    Type = "Anonymous",
                    Exp = 5,
                    Address = new {
                      Street = "456 Elm St",
                      City = "Sometown",
                      State = "CA",
                      ZipCode = "67890"
                    }
                  }
                };

                In this code example, the hackers array contains two objects, each with Type, Exp, and Address, which itself is an anonymous object with Street, City, State, and ZipCode properties. When dealing with complex data structures, we can utilize dot notation to access the nested properties like this:

                Console.WriteLine(hackers[0].Type);            // Output: Anonymous
                Console.WriteLine(hackers[0].Address.Street);  // Output: 123 Main St
                Console.WriteLine(hackers[1].Address.City);    // Output: Sometown

                Anonymous Types in LINQ

                LINQ can significantly simplify the code and enhance its readability when you work with complex data structures. To create an anonymous array using a LINQ query, modify the code as follows:

                This snippet creates a new anonymousArray with Type and Exp properties. The code uses LINQ to group the hackers by type and take the maximum experience in each group, then creates an array with the unique hackers, similar to the original code. The result is concise and readable while still achieving the desired functionality.

                Note that the order of the elements in uniqueHackers is not guaranteed, so we set the properties of each element in anonymousArray explicitly to ensure they match the original hackers array.

                var hackers = new[] {
                  new { Type = "Anonymous", Exp = 3 },
                  // A second, distinct type, so that uniqueHackers has two elements.
                  new { Type = "WhiteHat", Exp = 5 }
                };

                var uniqueHackers = hackers
                  .GroupBy(h => h.Type)
                  .Select(g => new { Type = g.Key, Exp = g.Max(h => h.Exp) })
                  .ToArray();

                var anonymousArray = new[] {
                  new { Type = uniqueHackers[0].Type, Exp = uniqueHackers[0].Exp },
                  new { Type = uniqueHackers[1].Type, Exp = uniqueHackers[1].Exp }
                };

                Infrastructure As Code Tools Role│Best IaC Tools


                Infrastructure as Code Tools

                Why Are Infrastructure as Code Tools Used in Cloud Platforms?

                IaC is a set of practices for managing and describing data centers with machine-readable configuration files, rather than manually editing configurations on servers or wiring up infrastructure by hand. Code for operating infrastructure is normally written in either a declarative or an imperative style.

                Infrastructure as code (IaC) is a common approach in cloud computing. Its core principle is to describe infrastructure with code, managed through an ordinary software development process. It is a core practice of DevOps and a component of continuous delivery: IaC allows DevOps teams to work promptly, reliably, and at scale, using a single set of methods and tools to develop applications and service the infrastructure behind them.

                From Hardware to Cloud Formation Tools

                Many people probably no longer remember the iron age, when we had to buy our own servers and computers. At that time, no one had any idea what cloud formation tools were. Now it seems crazy that the hardware buying cycle could limit infrastructure growth: a new server used to take weeks to deliver and install, and the software became available to developers many days after the hardware arrived.

                The first cloud computing services appeared only in the mid-2000s. They made it possible to launch new virtual machine instances quickly, which brought businesses and developers not only benefits but also problems: first and foremost, they had to maintain an ever-increasing number of servers. Still, these services were a long way from IaC tools.

                After that, large machines were gradually replaced by fleets of smaller ones, and the infrastructure footprint of an average engineering organization grew and became more cyclical. Ops had to support more and more things, and coping with peak load meant scaling up or down at different times of the day.

                To run efficiently, many instances had to be started in the morning to reach maximum capacity and shut down at night to reduce it. The whole process had to be managed manually, which became a challenge over time.

                All of the above led to the creation and adoption of infrastructure as code tools, which systematized these tasks as much as possible. IaC solutions made managing data centers and servers far more efficient through machine-readable definitions, replacing physical equipment and tools configured under direct human supervision.

                AWS CloudFormation, which emerged in 2011, was one of the first such tools. It became one of the best tools for DevOps, allowing engineers to version infrastructure as quickly as ordinary code and to track the resulting infrastructure versions in order to keep environments consistent.

                Well-known Infrastructure As Code Tools

                The top IaC tools that have become popular among developers recently are:

                • Terraform IaC
                • Amazon Web Service Cloudformation tools (AWS)
                • Azure Resource Manager
                • Ansible
                • Chef
                • Puppet
                • Pulumi
                • Saltstack
                • Google Cloud Deployment Manager
                • Vagrant
                • Crossplane

                Now, we would like to compare the aforementioned infrastructure as code tools to understand their similarities and differences in application area, writing method, and languages.



                • Terraform: declarative; HashiCorp Configuration Language; almost all cloud providers present on the market, architecture and cloud field
                • AWS CloudFormation: declarative; YAML or JSON; Amazon Web Services
                • Azure Resource Manager: declarative; YAML or JSON; Azure cloud services, with access control based on role
                • Google Cloud Deployment Manager: declarative; YAML or Python; Google Cloud resources and platforms
                • Ansible: declarative and imperative; YAML; cloud providers and web services
                • Chef: imperative; Ruby; cloud platforms and web services
                • Puppet: declarative; Ruby-based DSL; users of modules and plugins
                • Pulumi: declarative and imperative; TypeScript, Python, or Go; Azure, AWS, GCP, a universal tool that fits any platform
                • Vagrant: for engineers preferring a few virtual machines to big cloud-based infrastructures

                Main Infrastructure As Code Tools Tasks

                At present, it is hard to imagine the work of major providers and services without cloud automation tools. A wide range of IaC tools is dedicated to helping IT engineers solve challenges such as:

                • Deployment
                • Instrumentation
                • Configuration
                • Provisioning

                Earlier, IT specialists installed, configured, and updated software on cloud servers manually. Team members also stored and configured data the same way. This took a lot of time, required bringing in additional developers, and significantly increased expenses.

                Infrastructure as code became the way for professionals to address these problems, cutting labor costs and solving issues with scalability.

                It is worth being aware that some IaC tools live inside the infrastructure's own settings, while other kinds of tools manage applications and infrastructure in the environment.

                Below, we would like to say a few words about AWS infrastructure as code and its advantages.

                AWS Infrastructure As Code Advantages

                IaC aims to provision and manage cloud resources using a human-readable template that machines can easily consume. AWS CloudFormation is considered a reliable DevOps solution that applies IaC to Amazon Web Services.

                AWS CloudFormation lets a user describe the desired resources in their Amazon Web Services account, and then realizes that description on request. A typical infrastructure as code example is a template fragment that describes the creation of an Amazon Elastic Container Service (ECS) resource using YAML.

                In that template, we specify AWS::ECS::Service as the resource Type, declare the service discovery resource as a dependency, and in the properties set "App" as the service name, "Production" as the cluster, a maximum of 200 percent and a minimum of 75 percent in the deployment configuration, and a desired count of 5.
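                The fragment described above might look roughly like this in YAML (a sketch: property names follow the AWS::ECS::Service resource schema, and the exact shape depends on the surrounding template):

                ```yaml
                Service:
                  Type: AWS::ECS::Service
                  DependsOn: ServiceDiscovery
                  Properties:
                    ServiceName: App
                    Cluster: Production
                    DeploymentConfiguration:
                      MaximumPercent: 200
                      MinimumHealthyPercent: 75
                    DesiredCount: 5
                ```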

                AWS CloudFormation then takes the template and becomes responsible for creating, updating, and removing resources in the user's Amazon Web Services account according to its content. If a user adds a new resource to the file, CloudFormation builds that resource in the account. If the user updates a resource, the tool updates or replaces the existing resource. And if the user deletes a resource from the template, it is removed from the account.

                IaC tools provide users with many advantages:

                They are visible:

                An IaC template serves as a precise reference for what resources exist in your account and how they are configured. There is no need to go through the web panel to check settings.

                They are scalable:

                You can write the infrastructure as code once and then use it multiple times. A single high-quality template can serve as the basis for the same service in different regions of the world, which significantly simplifies horizontal scaling.

                They are stable:

                If the wrong parameter is changed or the wrong resource is removed through the web panel, you can break everything. IaC tools solve this problem, especially in combination with version control such as Git.

                They are transactional:

                CloudFormation tools not only create resources in your AWS account, but also wait for them to stabilize during startup. IaC tools check for successful initialization, and in case of any failure, they can carefully roll the infrastructure back to the last known good state.

                They are secure:

                IaC again provides you with a single template to deploy your architecture. As soon as your protected architecture has been created, you can use it many times, knowing that each deployed version has the same settings.


                IaC tools are popular instruments of a new generation, introduced at the beginning of this century to simplify forming cloud services, deploying, and adjusting infrastructure using nothing but code. There is no need for manual settings, which significantly simplifies developers' tasks and solves scalability problems as well. Terraform and its closest analog, Pulumi, are considered the most common tools used for cloud formation.



                What problems do the infrastructure as code tools solve?

                IaC tools were developed to fight inconveniences such as manually configuring software for clouds and services. Engineers who didn't have IaC maintained the settings of each deployment environment separately. Over time, each environment became a unique configuration of its own, which professionals call a "snowflake," and such environments cannot be reproduced automatically.

                When environments don't match each other, deployment problems follow. Administering and supporting infrastructure by hand always involves manual adjustments, which lead to errors that are difficult to track. IaC tools eliminate manual configuration and keep environments consistent, ensuring the desired state with quality code.

                Why is Terraform always number one among the IaC tools?

                Terraform is considered the best tool for DevOps and the most in demand on the market. It is an open-source IaC solution that is very flexible and supports the most prominent cloud services, such as Azure, GCP, and AWS.

                It can also manage many different cloud providers within a single workflow, and it can destroy provisioned resources while their source definitions remain saved.

                Terraform is considered a very cost-effective infrastructure as code instrument, as it is open-source by nature and comes with a great range of quality tools and scripts.

                What is the best Terraform alternative tool?

                Many IaC tools are similar to Terraform in their approach, usage, and description methods and are commonly used on cloud formation platforms. However, the Pulumi IaC tool is worth highlighting. Many specialists state that it has the same ability to create, manage, and deploy infrastructure on any cloud, and it is free and open-source like Terraform.