Background

While juggling school work and a full-time internship, I decided to represent my school in the WorldSkills competition. It was a decision I never regretted, and I appreciated the opportunity to explore the world of Cloud Computing.

However, it was not without difficulties. The competition dates coincided with Singapore's Circuit Breaker, which disrupted the original schedule and the plans of the original team: two members who had to enlist for national service could no longer participate, forcing us to form a new team.

Initial Timeline

10th Jan 2020 - First Meet Up
14th Feb 2020 - Official Photographs
28th Feb 2020 - Participants informed of competition dates
16th Mar 2020 - Official briefing from SP
24th Mar 2020 - Familiarization at NYP for 16 competitors
25th Mar 2020 - Start of competition
27th Mar 2020 - End of competition

This should have been the timeline if all went well, but that period saw a sharp increase in new COVID-19 cases in Singapore. Then came the news of the lockdown: "Straits Times: All entertainment venues in Singapore to close, gatherings outside work and school limited to 10 people". Hours after the news broke, we were informed that the competition had been postponed until further notice.

New Timeline

22nd May 2020 - Participants informed of new competition dates
30th Sep 2020 - Start of competition
02nd Oct 2020 - End of competition

After the new competition dates were announced, we looked for new members to join the team. Over the next few months, two of my juniors joined, while the original team and mentors continued to contribute, participate in discussions, and transfer knowledge. The new members worked hard alongside the mentors, and within two months they had earned their AWS Solutions Architect Associate certifications.

To reflect on my experience in the competition, I have decided to write a blog post about it. I hope that this will also pique the interest of others in the world of Cloud Computing.

Reflections - Areas of Improvement

As I walked into the venue on the first day of the competition, I was taken aback for a moment when I noticed that there were university students among the competitors. I had assumed that it was a competition exclusively for polytechnic students.

Aside from that, the challenges were hosted on the AWS Jam platform, and we had to solve multiple challenges of varying difficulty levels. I had expected a more traditional approach to the competition - questions given on paper, in black and white, or something similar. Jokes aside, I enjoyed the modern platform for solving the challenges and did not encounter any issues.

Looking at the challenges, I was surprised to see categories I did not expect, such as security and forensics; I had thought the scope would be limited to solutions architecture. As I regained my composure, I realized this was to be expected: as a software engineer, I should be able to tackle challenges without prior knowledge, just as I would have to search for solutions when encountering an unforeseen bug.

As an avid learner, I didn't waste any learning opportunities and noted down some of the challenges for further research. Below, I will go through in detail some of the struggles I had and some areas where I could have done better.

Overall Scope of Competition

  1. Network
  2. Data Recovery
  3. Access Control
  4. Compliance
  5. Security and Forensics
  6. DevOps

Day 1

After the timer started and the leaderboard appeared at the front of the room, four challenges were revealed, each with a specified difficulty level. Naively, I assumed those four would be the only challenges for the entire day - 9 AM to 5 PM including breaks, about six hours of working time. In hindsight, I should have expected more questions to be released after the break and not been caught off guard.

Although I felt confident and had no issues with the easy and medium-level challenges, I took quite some time on them because I wanted to be meticulous and read through every detail. Meanwhile, more experienced competitors from NYP and NP finished quickly, and the facilitators added extra challenges to serve as tie-breakers.

With the addition of two hard-level challenges (though non-mandatory), I became more wary of the time left and started rushing. For one of the mandatory hard challenges, I somehow convinced myself that 'alphanumeric' meant letters only, and crafted the initial command as:

grep -R -E '[A-Z]{20}' .

After some time, I came to my senses and realized that the character class should actually be ALPHANUMERIC:

grep -R -E '[A-Z0-9]{20}' .

Although I eventually solved it, I discovered after the competition that I could have saved much more time had I known that existing tools could do the job of finding the AWS secrets I needed. One such command is:

git-secrets --scan
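
For reference, git-secrets ships with built-in AWS credential patterns. A minimal usage sketch, assuming the awslabs git-secrets tool is installed and you are inside the repository you want to scan:

# register the built-in AWS credential patterns (one-time setup per repo)
git secrets --register-aws

# recursively scan the working tree for anything matching those patterns
git secrets --scan -r .

Compared to hand-crafting a grep pattern under time pressure, this catches access key IDs and secret keys in one pass.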

Attribute-based Access Control

This is one of the additional hard-level challenges, and I plan to explain it in detail.

https://cdn.sanity.io/images/9poqf6md/production/ee01fdf7c56abde6a5d30a0e36a1f7c8988c0863-1600x900.jpg

This challenge was mainly about IAM policies, which I had not practised enough, and I spent almost half of the remaining time trying to solve it using the visual editor. Despite that, I still couldn't get it, and due to the time constraints, I took the clues. It turned out I had overthought the challenge: I didn't have to add any new policy blocks, only slightly amend the existing statements.

Initial IAM Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "ec2:RunInstances",
            "Resource": [
                "arn:aws:ec2:*:875780615538:network-interface/*",
                "arn:aws:ec2:*:875780615538:volume/*",
                "arn:aws:ec2:*:875780615538:security-group/*",
                "arn:aws:ec2:*:875780615538:key-pair/*",
                "arn:aws:ec2:*:875780615538:subnet/*",
                "arn:aws:ec2:*::image/*",
                "arn:aws:ec2:*:875780615538:instance/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/Project": "Red",
                    "ec2:ResourceTag/Project": "Red",
                    "aws:RequestTag/Project": "Red"
                }
            }
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "ec2:CreateTags",
            "Resource": "arn:aws:ec2:us-west-2:875780615538:instance/*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/Project": "Red",
                    "aws:RequestTag/Project": "Red",
                    "ec2:CreateAction": "RunInstances"
                },
                "StringLike": {
                    "ec2:ResourceTag/Project": "Red"
                }
            }
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": [
                "ec2:StartInstances",
                "ec2:StopInstances"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:PrincipalTag/Project": "Green",
                    "ec2:ResourceTag/Project": "Green",
                    "aws:RequestTag/Project": "Green"
                }
            }
        },
        {
            "Sid": "VisualEditor3",
            "Effect": "Deny",
            "Action": "ec2:DeleteTags",
            "Resource": "arn:aws:ec2:us-west-2:875780615538:instance/*"
        },
        {
            "Sid": "VisualEditor4",
            "Effect": "Deny",
            "Action": "ec2:CreateTags",
            "Resource": "arn:aws:ec2:us-west-2:875780615538:instance/*",
            "Condition": {
                "StringNotEquals": {
                    "ec2:CreateAction": "RunInstances"
                }
            }
        }
    ]
}

Final IAM Policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "01AllowStopStartWithProjectTag",
      "Effect": "Allow",
      "Action": [
        "ec2:StopInstances",
        "ec2:StartInstances"
      ],
      "Resource": [
        "arn:aws:ec2:*:*:instance/*"
      ],
      "Condition": {
        "StringEquals": {
          "ec2:ResourceTag/Project": "${aws:PrincipalTag/Project}"
        }
      }
    },
    {
      "Sid": "AllowRunInstancesResourcesNoTags",
      "Effect": "Allow",
      "Action": "ec2:RunInstances",
      "Resource": [
        "arn:aws:ec2:*::image/*",
        "arn:aws:ec2:*:*:subnet/*",
        "arn:aws:ec2:*:*:network-interface/*",
        "arn:aws:ec2:*:*:security-group/*",
        "arn:aws:ec2:*:*:key-pair/*"
      ]
    },
    {
      "Sid": "02AllowRunInstancesWithProjectTag",
      "Effect": "Allow",
      "Action": [
        "ec2:RunInstances"
      ],
      "Resource": [
        "arn:aws:ec2:*:*:instance/*",
        "arn:aws:ec2:*:*:volume/*"
      ],
      "Condition": {
        "StringEquals": {
          "aws:RequestTag/Project": "${aws:PrincipalTag/Project}"
        },
        "ForAllValues:StringEquals": {
          "aws:TagKeys": [
            "Project"
          ]
        }
      }
    },
    {
      "Sid": "03AllowCreateTagsOnRunInstances",
      "Effect": "Allow",
      "Action": [
        "ec2:CreateTags"
      ],
      "Resource": [
        "arn:aws:ec2:*:*:instance/*",
        "arn:aws:ec2:*:*:volume/*"
      ],
      "Condition": {
        "StringEquals": {
          "ec2:CreateAction": [
            "RunInstances"
          ]
        }
      }
    }
  ]
}

Comparing the initial IAM policy that I used to the final one, you will notice that I had added overly specific rules with hard-coded tag values ("Red", "Green"). The cleaner, truly attribute-based approach is to use the ${aws:PrincipalTag/Project} policy variable, so a single statement works for whichever project tag the principal carries.

To improve in this area, I needed more practice creating IAM policies. Had I worked with them more often, I would have been able to solve this challenge quickly. During my internship, I had some access to IAM policies stored in Terraform, but although I wanted to learn more about them, I was occupied with other urgent DevOps tasks. Also, as I was not an admin, I was mostly assigned IAM permissions on request rather than creating the policies myself, which limited my experience.
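
One low-risk way to practise would have been the IAM policy simulator, which evaluates a policy document against hypothetical requests without touching real resources. A minimal sketch using the AWS CLI; the file name, account ID, and instance ID below are placeholders:

aws iam simulate-custom-policy \
    --policy-input-list file://abac-policy.json \
    --action-names ec2:StartInstances \
    --resource-arns arn:aws:ec2:us-west-2:111122223333:instance/i-0123456789abcdef0 \
    --context-entries "ContextKeyName=aws:PrincipalTag/Project,ContextKeyValues=Red,ContextKeyType=string"

The response shows whether each action is allowed or implicitly denied, which makes iterating on condition blocks much faster than trial and error in the console.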

That being said, without the DevOps experience that I gained during my internship, I would not have performed as well. I would like to take this opportunity to thank my supervisor for accepting my requests to work on DevOps tasks even though I was hired as a Software Engineer.

Lone Man Forensics

That was the title of the other additional hard-level challenge, which unfortunately I did not manage to finish due to lack of time. I will only give a brief description and not go into the details. I have noted down the solution for future research and might come back with an update on what could be done better.

This challenge involved an EC2 instance holding a folder whose name was the key to solving the challenge. However, access to the EC2 instance was completely blocked. To get at the instance's contents and locate the folder, one had to create a snapshot of its EBS volume, create a new volume from the snapshot, and attach that volume to another instance that was accessible.
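
For those curious, the whole snapshot-and-reattach flow can be done from the AWS CLI. A rough sketch; all IDs, the availability zone, and the device name are placeholders:

# snapshot the locked instance's volume
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "forensics copy"

# create a fresh volume from the snapshot, in the same AZ as the forensics instance
aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 --availability-zone us-west-2a

# attach it to the accessible forensics instance as a secondary device
aws ec2 attach-volume --volume-id vol-0fedcba9876543210 --instance-id i-0123456789abcdef0 --device /dev/sdf

# then, on the forensics instance, mount and browse it, e.g.
# sudo mount /dev/xvdf1 /mnt && ls /mnt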

Furthermore, the challenge required using Autopsy. Even with more time, I would probably only have completed the EBS part and attached the volume to the forensics instance with Autopsy installed; I would still have had to search for and learn how to use Autopsy on the spot. I felt somewhat frustrated, as I thought this was beyond the scope and favoured those who had studied cybersecurity. In fact, one of my juniors solved it by using a reverse shell to obtain the folder name.

Day 2 Challenges

  1. Data Recovery - Lost License (Easy)
  2. Data Recovery - A Nose For Secrets (Easy)
  3. Access Control - Privilege Escalation Control (Medium)
  4. Compliance - Automate Compliance Monitoring for the Win (Hard)
  5. Network Isolation - Isolation is Not Always Possible (Hard)
  6. Network - Stop Confidential Data Leaking (Hard)

I found the Day 2 challenges more interesting than those on Day 1. Although there were a few data recovery challenges, they were relatively simple and mostly involved setting policies on services like S3 to grant permission to view files. I can't recall the details exactly, but that was the gist of it.
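
From what I remember, the fix was along the lines of a bucket policy like the one below; a sketch only, with the account ID, role name, and bucket name as placeholders:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowReadOnlyAccess",
            "Effect": "Allow",
            "Principal": { "AWS": "arn:aws:iam::111122223333:role/JamParticipant" },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-challenge-bucket/*"
        }
    ]
}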

Moving on, the medium-level challenge focused on IAM, which I am weak at due to lack of practice. It is relatively easy to solve if you pay attention to the Sids of the policy, as they tell you what to do. For instance:

"Sid": "NEEDTOUPDATEONLYTHEADDCONDITIONBLOCKFORBOUNDARY"

Initially, I struggled because I had added a condition block using StringEquals instead of StringNotEquals.

https://cdn.sanity.io/images/9poqf6md/production/cfcf22968ed1bdb35b24ba04a89c2137d608d7bd-1158x544.png

I could have done better had I paid more attention to the questions and noticed the Sid. Even without it, I could have spent more time reading the questions to better understand the given scenario. Instead, I rushed into the Day 2 challenges, knowing that time would be short once more questions were added after the break.
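
I no longer have the exact policy, but the permissions-boundary pattern that the Sid hinted at generally looks like the statement below; a sketch, with the account ID and boundary policy name as placeholders:

{
    "Sid": "DenyCreateWithoutBoundary",
    "Effect": "Deny",
    "Action": [
        "iam:CreateUser",
        "iam:CreateRole"
    ],
    "Resource": "*",
    "Condition": {
        "StringNotEquals": {
            "iam:PermissionsBoundary": "arn:aws:iam::111122223333:policy/PermissionBoundary"
        }
    }
}

The StringNotEquals operator is what makes it work: creation is denied unless the request attaches the approved boundary, which is exactly why my StringEquals version failed.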

I was able to complete the two hard challenges smoothly. For the compliance challenge, although I had no prior experience with AWS Config, the service was well documented and, with a healthy dose of common sense, relatively easy to pick up. Three Config rules were created, two of which were custom rules backed by AWS Lambda functions.
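
For flavour, this is roughly how a rule is registered from the CLI; the sketch below uses an AWS managed rule, whereas the custom rules in the challenge additionally point at a Lambda function:

aws configservice put-config-rule --config-rule '{
    "ConfigRuleName": "s3-bucket-public-read-prohibited",
    "Source": {
        "Owner": "AWS",
        "SourceIdentifier": "S3_BUCKET_PUBLIC_READ_PROHIBITED"
    }
}'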

For the Network Isolation challenge, I regretted my approach. It was the afternoon, and I was frustrated that more networking challenges were still coming in. I'm quite familiar with Security Groups and NACLs from work, but for some reason NACLs completely slipped my mind while attempting this challenge.

In my frustration, I opened the first clue and realized that I had overlooked the NACL while trying to secure the security group. Even after configuring the NACL to deny everything (as a newly created custom NACL does by default) and leaving no inbound or outbound rules on the security group, traffic was still coming in on port 80 and some other random ports from a few of the six instances.

I spent a lot of time trying every possible solution before opening the next clue, which told me I had not restricted the NACL. In the end, after narrowing everything down, I found that one particular instance was still receiving traffic on port 80, and that was the answer I needed.

I used a CloudWatch Logs Insights query to filter the flow logs for dstAddr and dstPort where the action was ACCEPT. I regretted rushing to the clue, as I had subconsciously suspected that I had overlooked some important rule.
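
The query was something like the following; a sketch, with the log group name as a placeholder:

aws logs start-query \
    --log-group-name /vpc/flow-logs \
    --start-time $(date -d '1 hour ago' +%s) \
    --end-time $(date +%s) \
    --query-string 'fields @timestamp, dstAddr, dstPort, action
        | filter action = "ACCEPT"
        | stats count(*) as hits by dstAddr, dstPort
        | sort hits desc'

# fetch the results once the query completes, using the queryId returned above
aws logs get-query-results --query-id <queryId>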

Network Challenge

Lastly, I will provide some details about the one network challenge that, judging by the leaderboard, I suspect none of the participants managed to solve.

https://cdn.sanity.io/images/9poqf6md/production/27f012f6b3715422378c78d7c58652d360caafc2-1600x713.png

Situation:

A security incident occurred last week. There are two VPCs (AppVPC1 and AppVPC2), each with a forward proxy for connecting to the outside. All the current proxies have a software vulnerability. Our security team decided to build a new "integrated forward proxy" and disable all the current proxies.

Current Architecture

  1. Each VPC (AppVPC1 and AppVPC2) has a proxy server with a software vulnerability. You need to stop all proxy servers in the VPCs after launching the new integrated forward proxy servers.
  2. Currently, all applications connect to the Internet via the proxy server in their VPC.
  3. The current proxy servers listen only on port 8080. Their access logs are stored in CloudWatch Logs (/JAM/proxy/app/accesslog).
  4. All the VPCs' CIDR blocks are the same, so you cannot use VPC Peering.
  5. All applications connect to the proxy server using the proxy's DNS name, which is registered in the DNS service.
  6. The security team has already built a new VPC (CommonVPC) for launching the integrated forward proxy. A NAT Gateway is deployed in the public subnets.
  7. The security team has also built a "Launch Configuration" for Auto Scaling of the integrated forward proxy.
  8. AppVPC3 has a web server that is accessed from applications in AppVPC1 and AppVPC2 via the Internet.

Requirements

  1. Create a new "integrated forward proxy" in the private subnet of the CommonVPC, and stop the current forward proxy servers after finishing all configuration.
  2. All applications need to access the Internet and the AWS API via the integrated forward proxy servers.
  3. In addition, all applications need to connect to the AWS API without Internet access (you need to configure VPC Endpoints). Analyze the access logs (/JAM/proxy/app/accesslog) of the current proxies to confirm which AWS APIs are currently used.
  4. Don't launch EC2 instances manually; configure Auto Scaling to remove any single point of failure for the integrated forward proxy.
  5. You cannot use VPC Peering in this environment; create VPC Endpoint Services (PrivateLink) instead.
  6. From AppVPC1 and AppVPC2, access to the S3 bucket needs to be limited to reduce data leakage risk.
  7. From your VPC, only s3:GetObject needs to be allowed; all other S3 actions need to be prohibited (see the endpoint policy sketch below).
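
Requirements 6 and 7 map naturally onto an S3 gateway endpoint policy, since a custom endpoint policy only lets through the actions it explicitly allows. A minimal sketch of what that might look like:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowGetObjectOnly",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::*/*"
        }
    ]
}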

Upon first glance at the challenge, I launched the integrated forward proxy servers with an Auto Scaling group and an ELB. Then I created the VPC endpoints in CommonVPC for the four required services. However, I got stuck at the VPC Endpoint Services (PrivateLink) stage, and by that point I only had about 10 minutes left, so I opened the clue to learn about it. Even then, I was stuck at the Route 53 configuration, as I could not access it using the new console UI; according to other participants, it was accessible using the old one.
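
For context, publishing the proxy as an endpoint service and consuming it from the app VPCs goes roughly like this; a sketch, with every ARN and ID a placeholder:

# in CommonVPC: publish the proxy's Network Load Balancer as an endpoint service
aws ec2 create-vpc-endpoint-service-configuration \
    --network-load-balancer-arns arn:aws:elasticloadbalancing:us-west-2:111122223333:loadbalancer/net/proxy-nlb/0123456789abcdef \
    --no-acceptance-required

# in each AppVPC: create an interface endpoint pointing at that service
aws ec2 create-vpc-endpoint \
    --vpc-endpoint-type Interface \
    --vpc-id vpc-0123456789abcdef0 \
    --service-name com.amazonaws.vpce.us-west-2.vpce-svc-0123456789abcdef0 \
    --subnet-ids subnet-0123456789abcdef0 \
    --security-group-ids sg-0123456789abcdef0

The step I got stuck on, pointing the proxy's existing DNS name at the new interface endpoint, would then be done in Route 53.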

I could elaborate further on the approach to this challenge, but I figured it would not be beneficial, as one would have to set up the entire infrastructure to properly understand it. I will end here with some reference materials and a not-so-refined diagram that one of my juniors and I came up with as a reference for our recap discussion.

Reference Materials

  1. VPC Peering
  2. VPC Endpoint Services (PrivateLink)
  3. VPC Endpoint (Interface)
  4. CloudWatch Logs Insights
  5. Session Manager

https://cdn.sanity.io/images/9poqf6md/production/fb0e9e28fd1c66e3710b05d4cedbcbd9d33a36ad-1312x702.png

Day 3

https://cdn.sanity.io/images/9poqf6md/production/916c9ea27981a03debf5909ca247656c7c95010d-1574x653.png

The last day's questions were doable. For the DevOps challenge, I had to search through the documentation, as I had no prior experience with AWS CodePipeline.

On this day, I tackled the hard challenges first, as the RDS and Aurora IAM challenge took a long time to load; I completed it early, at around 11 AM. For the last hard challenge, the Forensics one, I had no choice but to open all the clues, as I was left with only five minutes. With more time, I could eventually have found the answer by grepping for the key, as I had already completed all the other steps needed.

After the competition, I discussed it with one of my juniors, who is experienced in cybersecurity. He knew the files would likely be in the /tmp folder and was able to solve the challenge easily.
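
In other words, once the evidence volume was mounted, something as simple as the line below would likely have surfaced the key; the mount point and key length here are assumptions:

# search the mounted evidence volume for 20-character alphanumeric strings
grep -R -E '[A-Z0-9]{20}' /mnt/evidence/tmp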

Well, I have nothing more to mention about Day 3, other than that it was all about experience. If I had worked with CodePipeline a few times before, I would have saved a lot of time and had more left for the Forensics challenge. Although most challenges were manageable, since the AWS services involved are well documented, experience matters a lot for speed. For example, having worked extensively with RDS, I completed the RDS challenge quickly once it loaded, but the CodePipeline and Forensics challenges, which I was unfamiliar with, took me quite some time.

Overall Takeaway

If I were to participate in such a competition again, I would make sure to remain calm and take the time to read the questions thoroughly. If I got stuck, I would remind myself to go back and reread the questions before rushing to the clues. I would avoid frustration and keep my composure.

I would also tackle the hard challenges first, as I concentrate better in the morning. By the afternoon, my energy level drops a lot, which could explain the instances where I could not solve questions quickly.

Although practical experience working with AWS services is important, I should also spend my free time reading the documentation and the case studies on the AWS blogs. There are various good examples there, and being well-read in the case studies pays off.

Overall, it was a great experience, and I am glad to have had such an opportunity. It was a valuable learning experience regardless of the outcome, and I appreciate being awarded the Medallion for Excellence. At the end of each day, I made sure to take notes and review the solutions to reanalyze them and conduct further research. I aim to read more case studies to gain insight into different approaches to certain scenarios, for example in using VPC Endpoints and PrivateLink. I would like to end by thanking my mentors from SP for their guidance and dedication to the team.