- The joy of meeting again
Team Building

The year 2020 brought major changes to our way of life. The unexpected and sudden crisis caused by the COVID-19 pandemic had profound implications for our work routine. Remote work, or "home office", became a trend forced on us by the new reality, abruptly changing the habits and the richness that come with sharing an office. We changed the way we think about work and we adapted, but we always longed to return to our workplaces to recover the most valuable thing we had lost: social interaction and face-to-face communication.

As the legal restrictions were relaxed, Teracloud reopened its doors to its team. We enjoyed seeing each other again, sharing, and being able to move freely through the streets... but we still had to get past the work bubbles and all meet together. Understanding this need and the wishes of its employees, in December 2021 Teracloud organized the "Teraweek", bringing most of the team together at its Latin American headquarters. "Teraclouders" arrived in Córdoba from Quilmes (Buenos Aires), General Fernández Oro (Río Negro), Tandil (Buenos Aires), and Montevideo (Uruguay).

Throughout the week we shared not only the workday but also breakfasts full of anecdotes, lunches with long after-meal conversations, dinners, outings, and games full of laughter. We were reunited! The highlight of the week, after the joy of seeing some colleagues again and meeting others for the first time, was Friday, December 17, when Teracloud surprised us with an outdoor recreational day.

"I'm leaving the Teraweek speechless. Being Uruguayan and having been given the opportunity to be here... I leave grateful" - Rodrigo, DevOps Engineer Teracloud.

We set off for the Córdoba hills, bound for the Acuarela del Río estancia, located on the banks of the San Pedro river in San Clemente (Santa María department). There we enjoyed the river, hiked cross-country, practiced yoga (or tried to), challenged each other in pool tournaments, talked… talked a lot, enjoyed each other's company, reconnected and, some of us, even got emotional.

Although the pandemic continues and the personal and professional challenges remain, after this great experience we began 2022 certain that we will soon meet again and confident that Teracloud continues to foster team spirit, supporting us and understanding the personal situations that this new reality brought with it.

"The Teraweek was super important for getting to know each other. I never hesitated to come. On the contrary, I was super motivated. In fact, I'm going to try to come back at some point to keep doing team building, because I think it's great. I like the remote structure, but I feel that sometimes we need a bit of contact with our colleagues" - Mariano, DevOps Engineer Teracloud.

Victoria Vélez Funes SEO - SEM Specialist teracloud.io #Teracloud #TeraTeam #TeamBuilding #Teraweek #Teraclouders #DevOps
- AWS S3 with CloudFront, high-performance security
Services

Amazon CloudFront is a content delivery network (CDN) service built for high performance, security, and developer convenience. It can be used as the frontend of many services (S3 buckets, ELBs, media distribution, and any other HTTP server running on an EC2 instance or any other kind of host). In addition, CloudFront uses edge locations to cache copies of the content it serves, so the content is closer to users and can be delivered to them faster.

Edge locations are AWS data centers designed to deliver services with the lowest latency possible. Amazon has dozens of these data centers spread across the world. They're closer to users than Regions or Availability Zones, often in major cities, so responses can be fast and snappy.

Here we will use CloudFront to provide access from edge locations to S3 buckets, which can be used for static websites or, in this case, as file server storage.

Resources

First, we create a Terraform infrastructure that contains the following resources:
  - AWS CloudFront distribution as frontend access
  - AWS S3 bucket for storage
  - AWS Route 53 zone for records
  - SSL certificate
  - IAM user/group: credentials for users
  - IAM policies: access to and management of buckets

After applying this plan we can see the following resources in the AWS Console: the CloudFront distribution and the AWS S3 bucket.

Note: Remember to set the bucket as private and manage access through CloudFront.

CloudFront OAI

We need to configure an OAI (origin access identity) because we want CloudFront to access private S3 buckets. Finally, it is necessary to attach the OAI policies to the CDN (S3 origin config).

Route 53

In this case, we assume that the Route 53 zone already exists, so we will use Terraform data sources to retrieve the resource.
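As a rough illustration of the OAI step described above, the bucket policy that lets a CloudFront origin access identity read objects from a private bucket could be generated like this. This is only a sketch, not the policy from the referenced repository: the helper name, the bucket name, and the OAI ID are hypothetical placeholders, and only `s3:GetObject` is granted.

```python
import json


def oai_bucket_policy(bucket_name: str, oai_id: str) -> dict:
    """Build an S3 bucket policy that lets one CloudFront OAI read objects.

    Hypothetical helper for illustration; bucket_name and oai_id are supplied
    by the caller and are not taken from the original post.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIRead",
                "Effect": "Allow",
                # Principal format AWS uses for origin access identities.
                "Principal": {
                    "AWS": (
                        "arn:aws:iam::cloudfront:user/"
                        f"CloudFront Origin Access Identity {oai_id}"
                    )
                },
                "Action": "s3:GetObject",
                # Read access to every object in the bucket, nothing else.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # Both values below are made-up placeholders.
    policy = oai_bucket_policy("example-files-bucket", "E2EXAMPLEOAIID")
    print(json.dumps(policy, indent=2))
```

Because the policy names only the OAI as principal, the bucket itself can stay private, which is the point of the note above.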
SSL Certificate

Next, if you don't have an SSL certificate for this record's domain or a wildcard certificate for the whole domain, you can create and deploy a free certificate in ACM (AWS Certificate Manager).

IAM

We need to create a group with policies that allow listing and accessing the buckets, then add users to this group.

Desktop GUI - S3 Access

We now have a CloudFront distribution that is used to access S3 resources and to upload and download files, but we don't want users to log in to our AWS account and navigate to S3; that doesn't seem to be a good practice. Instead, we will use the Cyberduck client, a desktop application that connects to any Amazon S3 storage region and supports large file uploads.

If your account is in a standard AWS US region, select the following profile:

Note: If you use other AWS platforms (like GOV or China) you can download the right profile from the Cyberduck official webpage.

Finally, we can connect with the user's access keys, navigate and open our buckets, and upload or download files in them.

References
  - Cloudfront + S3 example: https://github.com/teracloud-io/cfs3-blog
  - Terraform resources: https://registry.terraform.io/providers/hashicorp/aws/latest/docs
  - Cyberduck Client: https://cyberduck.io/

If you are interested in learning more about our TeraTips or our blog's content, we invite you to see all the entries that we have created for you and your needs.

Nicolas Balmaceda DevOps Engineer teracloud.io #Teracloud #TeraTips #aws #awslatam #DevOps #Cloudfront #AmazonS3 #S3 #security #EC2 #EdgeLocations #Terraform #cloud #ssl
- Instance interactive access
Generally, we create Linux instances with port 22 open so that we can access them via SSH. By using AWS Systems Manager Session Manager instead of connecting directly over SSH, we don't need inbound rules that open ports in Security Groups.

In this example, the instance's Security Group has no inbound rules. Normally we'd require TCP 22 to SSH into this instance. If we go back to Systems Manager, open the instance actions, and start a session... what's going to happen? We are connected to our instance!

We strongly recommend using AWS Systems Manager Session Manager to manage instances. It also allows MFA and provides command history auditing.

Like what you read? You may also be interested in reading Using SSM Parameter Store. Follow us on our social networks to find out about all the news from the cloud world and to find more TeraTips!

Mariano Logarzo DevOps Engineer Teracloud If you want to know more about our services, tips, blogs, or a free assessment email our team member firstname.lastname@example.org #Teracloud #aws #TeraTips #SSM #cloudsecurity #AWSSecretsManager #SSMdocuments #systemmanager #instance #SSH #MFA
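For the Session Manager tip above, one way to confirm that an instance really no longer exposes SSH is to inspect its Security Group's inbound rules. The sketch below assumes the rules are given in the `IpPermissions` shape returned by the EC2 `DescribeSecurityGroups` API (a list of dicts with `IpProtocol`, `FromPort`, and `ToPort`); the function name is hypothetical and no AWS call is made here.

```python
def allows_inbound_ssh(ip_permissions: list) -> bool:
    """Return True if any inbound rule would admit TCP traffic on port 22.

    ip_permissions is assumed to follow the IpPermissions structure of the
    EC2 DescribeSecurityGroups response.
    """
    for rule in ip_permissions:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            # "-1" means all protocols and all ports, which includes SSH.
            return True
        if protocol == "tcp":
            from_port = rule.get("FromPort", 0)
            to_port = rule.get("ToPort", 65535)
            if from_port <= 22 <= to_port:
                return True
    return False


if __name__ == "__main__":
    no_rules = []
    ssh_open = [{"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22}]
    print(allows_inbound_ssh(no_rules))
    print(allows_inbound_ssh(ssh_open))
```

A group for which this returns False can still be reached through Session Manager, since sessions do not rely on inbound rules.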
- Technological Challenges for Startups in 2022
A couple of years ago, talking about Covid-19 was new for all of us; today it is so much a part of daily life that it sounds like a cliché. As humans, we have largely overcome the arrival of this virus, and so have many of the companies that were affected and had to find new ways to sell their products and services.

As 2022 begins, there is more attention than ever on early-stage startups, as they have played a major role in driving economic recovery globally. The success of a promising early-stage startup, of course, depends on many things, including personalities, outside forces (such as pandemics), and good old-fashioned luck. But the best young startups already have a clear vision of a pressing market need and the clear beginnings of a compelling way to address it.

The impact of digital transformation, the reinvention of professional development and new ways of working, the use of Big Data and Artificial Intelligence in the work environment, the blockchain, and new digital spaces such as the metaverse are some of the new trends and challenges that startups face, without neglecting how technology and its relationship with sustainability open up a new world of opportunities for companies.

According to various studies, at least 40% of companies will disappear in the next 10 years if they do not adapt to new technologies. This is therefore the opportunity for startups to anticipate the transformation that society is experiencing, driven by the need to overcome the enormous challenges they have had to face in this period: from security and privacy to productivity, by way of the role of technology and the limits between the real and the virtual world.

Paul Graham, a renowned venture capital investor, points out that the ideal growth rate for a startup would be 5% to 7% per week. To sustain that pace, startups work with greater agility in decision-making and process execution.
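To put the weekly growth figure quoted above in perspective, weekly growth compounds: a rate r sustained for 52 weeks multiplies the starting value by (1 + r)^52. The short sketch below only does that arithmetic; the function name is hypothetical and a constant weekly rate is a simplifying assumption.

```python
def annual_multiple(weekly_rate: float, weeks: int = 52) -> float:
    """Return the growth multiple after compounding weekly_rate for `weeks` weeks."""
    return (1.0 + weekly_rate) ** weeks


if __name__ == "__main__":
    for rate in (0.05, 0.07):
        multiple = annual_multiple(rate)
        # 5% a week compounds to roughly 12.6x a year, 7% to roughly 33.7x.
        print(f"{rate:.0%} per week -> about {multiple:.1f}x per year")
```

Seen this way, the gap between 5% and 7% per week is the difference between growing about thirteenfold and about thirty-fourfold in a year.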
This factor leads us to a preliminary conclusion: in a startup, whatever its domain, time is the scarcest and most valuable resource. The main challenge, then, is to build agile methodologies that accompany the startup's growth process and allow efficient analytics, both growth and consumer analytics, for any type of company.

On this point, according to a survey of 1,300 professionals from around the world conducted by the open-source company Red Hat between June and August 2021, companies have started their digital transformation projects mainly because their growth brings challenges that require this technological evolution. In fact, 31% of those surveyed said they are in this process and 22% have accelerated their efforts to do so, while 8% said they are in the early stages of starting new projects, which further highlights this change.

Startups that currently solve universal problems with technology as an ally capture the attention of capital worldwide, and even more so those that focus on data science. They have the distinctive ability to manage the flow of users, know the growth of each user segment, understand user behavior on digital platforms, and develop the algorithms each segment needs to optimize their advertising schedule, thus creating personalized loyalty strategies for each user segment.

2022 will be a year of business opportunities and great challenges for many, as long as companies care about enhancing the shopping experience, offering personalized services, listening to complaints, meeting the needs of their customers as if they were their own and, above all, as we mentioned earlier, about digitization and innovation.

The future will be technological and digital. Both digitization and the cloud play an essential role in competitiveness, adaptation to the market, and business success. With them, companies can spend their time innovating and improving their competitiveness.
Work with Teracloud to modernize your business and count on reliably running workloads, great serverless operational models, agile development processes, and 24/7 personal support. Give your company significant business opportunities. Start and grow while saving money, moving faster, and integrating on-premises businesses and data in your own organization. Ready to start? It's time to operate differently!

Liliana Medina Social Media and Digital Content Manager Teracloud #Teracloud #aws #awslatam #covid19 #technologicalchallenges #Terablog #startups #datascience #DevOps #cloud #cloudcomputing #digitization #innovation #security #AI
- AWS Dry-Run
Sometimes we want to know whether we have the necessary permissions to execute a certain command, but we don't really want to execute it! And let's say that, for some reason, we can't access IAM and check our permissions. For this we have --dry-run, a parameter that applies to some AWS CLI commands.

Suppose we want to know whether our user has the necessary permissions to run instances. We have a test user, teratip. A correct example of run-instances would be something like this:

    aws ec2 run-instances --image-id ami-02edf5731752693cc --instance-type t2.micro

But we don't want to run this, because if we have the correct permissions the instance will be created and our billing will be affected! So let's just add --dry-run to the command and see what happens:

    aws ec2 run-instances --dry-run --image-id ami-02edf5731752693cc --instance-type t2.micro

The message indicates that an error occurred when calling the RunInstances operation: the request would have succeeded, but we specified the --dry-run flag. This is the expected behavior when we have the proper permissions for the operation.

Now, using a user that does have access to IAM, let's check the policy assigned to the teratip user. As we can see, the teratip user has the correct permissions to execute the ec2 run-instances command.

Now let's do a test: remove the assigned policy from the user and see what happens when we execute the command. As we can see, when the operation is executed with --dry-run and we don't have the necessary permissions, we get a long error instead.

This is a great way to test our API calls without affecting our billing.

Rodrigo González DevOps Engineer Teracloud #Teracloud #TeraTips #aws #awslatam #DevOps #dryrun #EC2 #AWSCLI #learnmore #cloudcomputing Follow us on our social networks for more TeraTips
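The same dry-run check from the TeraTip above can be scripted. The sketch below assumes boto3 is used instead of the AWS CLI: with `DryRun=True`, EC2 raises a `ClientError` whose error code is `DryRunOperation` when the call would have succeeded and `UnauthorizedOperation` when permissions are missing. The function names are hypothetical, and the AMI ID and instance type are whatever the caller passes in.

```python
def interpret_dry_run(error_code: str) -> str:
    """Map the error code of a dry-run call to a permission verdict."""
    if error_code == "DryRunOperation":
        # The request would have succeeded without the dry-run flag.
        return "allowed"
    if error_code == "UnauthorizedOperation":
        # The caller lacks the permissions needed for the operation.
        return "denied"
    # Any other code (bad AMI ID, throttling, ...) says nothing about permissions.
    return "unknown"


def can_run_instances(ec2_client, image_id: str, instance_type: str) -> str:
    """Issue RunInstances with DryRun=True and classify the outcome.

    ec2_client is assumed to be a boto3 EC2 client created by the caller.
    """
    # Imported lazily so interpret_dry_run stays usable without boto3 installed.
    from botocore.exceptions import ClientError

    try:
        ec2_client.run_instances(
            ImageId=image_id,
            InstanceType=instance_type,
            MinCount=1,
            MaxCount=1,
            DryRun=True,
        )
    except ClientError as err:
        return interpret_dry_run(err.response["Error"]["Code"])
    # A dry run is expected to raise; reaching this point is unexpected.
    return "unknown"
```

A caller would typically build the client with `boto3.client("ec2")` under the profile being tested and then call `can_run_instances(client, "ami-02edf5731752693cc", "t2.micro")`.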
- What are we talking about when we talk about service delivery?
How many times a day do we see job postings for Service Delivery Managers? How many people list this experience in their professional profiles? But do we know what it is about? Do we know what areas it covers? And what knowledge should a Service Delivery Manager have?

If we search the web, we will see that Service Delivery is a framework that encompasses a set of principles, standards, policies, and restrictions used to guide the design, development, implementation, operation, and retirement of the services provided by a specific company or provider. These tasks are carried out by a Service Delivery Manager (SDM) who, depending on the industry and/or company in which they work, will perform these or other types of tasks.

SDMs play an important role in any organization, as they are in charge of keeping customers happy with the services of the company they work for. And you may ask yourself: how do we keep a customer happy? The SDM is responsible for the results of a project for the client. They must maintain fluid contact and report on the progress of the project. That is why the Service Delivery Manager often acts as the Project Manager at the same time. The responsibilities of a Service Delivery Manager include managing SLAs and KPIs, contract negotiation, improvement proposals, etc., and acting as an interlocutor between customer and supplier.

Why is a Service Delivery Manager considered an added value?

Centralizing information is essential to carrying out a successful project. And when it comes to communication, the Service Delivery Manager is an essential piece in the management of a project, acting as a facilitator between client and supplier. The Service Delivery Manager is in charge of maintaining fluid communication with every client whose project they manage. Through reports that provide visibility, the client can see the progress of their project on a daily basis without taking time away from IT specialists.
As manager of each project, the SDM is a fundamental collaborator for the technical team, analyzing the needs of the clients and translating them into deliverables that are monitored daily to facilitate later communication. Personalized treatment greatly improves the customer experience and the internal coordination with those who will be in charge of providing the service. The focus is on improving business results and customer satisfaction, meeting both expectations.

What are a Service Delivery Manager's responsibilities?

SDM tasks may vary depending on the type of company they work for. In the case of the SDM of an IT company, tasks may include:
  - Being a communicator and facilitator between client and service provider
  - Leading the work team, defining the scope and needs of the project together with the client, and translating them into a plan for the internal team
  - Guaranteeing established processes
  - Agreeing on objectives, deliveries (SLAs), and continuous improvement projects with the client
  - Conflict management and negotiations
  - Project tracking
  - Giving clients visibility of project progress
  - Preparing reports
  - Maintaining contact after the delivery of the project in case the client has any unforeseen need, or to propose improvements
  - Measuring the performance and quality of the service internally
  - Managing the overall well-being of their team
  - Cost-related tasks, such as managing cost reduction without sacrificing customer satisfaction
  - Obtaining and evaluating customer feedback to improve services

What skills should a Service Delivery Manager have?

This is a partial list of the many responsibilities that the Service Delivery Manager's job entails:
  - Ensure the quality of the service at the technical level and at every other level.
  - Manage resources. The SDM must efficiently organize the time and work of the talent assigned to each project.
  - Work closely with the DevOps team, owners, and clients. The definition of roadmaps and business requirements is important.
  - Understand the challenges of migrating on-premises infrastructure and applications to the cloud, in order to close the gaps identified in the cloud solution strategy design process.
  - Accompany clients through any type of problem that may arise, managing the financial aspects of the contract.
  - Ensure that contracts are carried out in accordance with the agreed terms.
  - Promote a good work environment, which is essential for the team to perform its daily tasks efficiently.

Teracloud commitment

Skills improve with experience. Striving to get to know customers, building pleasant and positive relationships with them, and maintaining ethical and fluid communication are what distinguish the customer service experience. Since the Service Delivery Manager is the main point of contact with customers, delivering an outstanding customer experience is essential to maintaining a healthy, trustworthy, and efficient relationship.

And this is our commitment: to provide the best customer experience to keep clients happy, to maintain a relationship that lasts over time, and to generate the trust clients need to hand their projects over to us, knowing that Teracloud will act as a reliable, safe, responsible, and professional partner.

We work to provide solutions that allow our clients not only to meet all these needs but also to adapt with sufficient agility to changing demands. That is why we offer services that unify all processes in the cloud, simplifying internal work and thus achieving appropriate follow-up in a much faster and more flexible way. We are a group that is constantly transforming and growing, hand in hand with flexibility and success.

Start your journey to the cloud. How can we help you? Leave us your comments or fill out the form on our website.
https://www.teracloud.io/services-cloud-management Carolina Guerrero Service Delivery Manager Teracloud #Teracloud #TeraBlog #aws #awslatam #DevOps #SDM #customerexperience #customerservice #ServiceDeliveryManager Follow us on our social networks for more TeraTips
- Easy CodeCommit authentication with git-remote-codecommit
There is a new and much easier way to interact with CodeCommit repositories from git. Forget the times when you needed to run git config to set up helper scripts for authentication, and enter the 21st century with git-remote-codecommit.

In a nutshell, this new git helper gives you two features:
  - It authenticates using your AWS CLI credentials automatically.
  - It adds a new codecommit:// protocol that simplifies the URL and naming of the repositories.

How to install it

Install the helper with pip:

    pip install --user git-remote-codecommit

How to use it

Once you have installed the codecommit helper, you can clone any repository using the following syntax:

    git clone codecommit://$PROFILE@$REPOSITORY

Where:
  - $PROFILE is the name of the profile for the repository in your AWS config.
  - $REPOSITORY is the repository name in CodeCommit.

Conclusion

Using the new helper lets you be up and running with new repositories in no time, and saves you the work of maintaining credentials and configuration files.

Carlos Barroso Senior MLOps Engineer teracloud.io #Teracloud #TeraTips #aws #awslatam #DevOps #Codecommit #Github #code #security #awscli Follow us on our social networks for more TeraTips
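When many repositories and profiles are involved, the clone URL from the tip above can be assembled programmatically. This is a small sketch with a hypothetical helper name; the `codecommit://$PROFILE@$REPOSITORY` form comes from the post, while the region-qualified `codecommit::REGION://` form is how I understand the helper's documentation and should be verified before relying on it.

```python
from typing import Optional


def codecommit_url(
    repository: str,
    profile: Optional[str] = None,
    region: Optional[str] = None,
) -> str:
    """Build a clone URL for the git-remote-codecommit helper.

    profile is the AWS CLI profile name; region is optional and, if given,
    uses the region-qualified form (an assumption to double-check).
    """
    scheme = f"codecommit::{region}://" if region else "codecommit://"
    prefix = f"{profile}@" if profile else ""
    return f"{scheme}{prefix}{repository}"


if __name__ == "__main__":
    # "dev" and "my-service" are made-up example values.
    print("git clone " + codecommit_url("my-service", profile="dev"))
```

Leaving out the profile yields `codecommit://my-service`, in which case the helper falls back to the default AWS CLI credentials.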
- How to secure your business? Is your data secure?
One of the main concerns of a company when deciding to move part or all of its computing and data management resources to a cloud computing service is security. That is why data protection is the first step toward an adequate cloud security strategy.

Due to the pandemic, e-commerce, online transactions, and the use of digital services increased dramatically, but as companies have adapted to the digital world, thieves have also found new ways to commit crimes and have increased their digital attacks. From ransomware, phishing, and hacking to data breaches and insider threats, businesses are now exposed to a growing number of vulnerabilities that are multiplying fast and becoming more dangerous every day. On top of that, cryptocurrencies provide a degree of anonymity that makes the attackers' job easier; these are technologies that can facilitate the illegal trafficking of drugs, weapons, and explosives, human trafficking, money laundering, terrorist activities, and cybercrime.

As a consequence, business owners lose money, data governance, and customer data but, more importantly, the trust of their users and buyers.

Awareness is an important step! A survey by The Pearson Institute and The Associated Press-NORC Center for Public Affairs Research shows that "about 9 in 10 Americans are at least somewhat concerned about hacking involving their personal information, financial institutions, government agencies, or certain public services. About two-thirds say they are very or extremely concerned".

(Figure: CyberEdge Group, Cyber Threat Defense Report 2021.)

Cloud security refers to the practice of protecting the integrity of cloud-based applications, data, and virtual infrastructure, because cyber attackers can exploit security vulnerabilities, using stolen credentials or compromised applications to carry out attacks, interrupt services, or steal confidential data. That is why it is now essential to reinforce security measures and apply best practices to ensure business continuity.
One of the best solutions is to automate the security of your operation. With security automation, you can develop an environment designed to address today's most important threats, as well as increasingly stringent compliance requirements. With this knowledge, design a protection strategy that covers the periodic generation of backup copies of the data, their storage in a safe place, and the access protocols for those copies once stored.

In addition to the peace of mind that comes from having specialists who know how to protect company information, having consistent, scheduled backups is a clear way to save money, because it saves the work of producing and storing those backups. It also allows better compliance with the Organic Law on Data Protection, which requires ensuring the correct storage of user and customer information. And finally, having experts manage the data guarantees easy and simple access to the backups.

Cloud computing is increasingly the norm rather than the exception for businesses. Put security first and your organization will be able to drive top business benefits while significantly reducing cyber risk.

Finally, we want to share these tips with you from an IT point of view:
  - Implement smart scanning for vulnerabilities
  - Security by Design; you can get more information at this link (https://aws.amazon.com/es/compliance/security-by-design/)
  - Look beyond SSL
  - Always encrypt data
  - Limit access to sensitive information
  - Firewall
  - Keep compliance on track

Do you want to know if your company is at risk? Take action and let us review your infrastructure security based on industry best practices. Schedule your 30 min call here.

At Teracloud we are specialists in protecting information and ensuring that your business has the tools it needs to offer products and services in complete safety.
Liliana Medina Social Media and Digital Content Manager Teracloud If you want to know more about our services, email our team member email@example.com #Teracloud #aws #AWSLatam #TeraBlog #Data #cloudcompanies #datacompanies #digitaltransformation #automate #cybersecurity #pishing #hackers #ransomware #cryptocurrencies #cyberattacks #security
- CyberSecurity Month: Tips to avoid being phished
October is Cybersecurity Awareness Month. And just today, Twitch has been breached, badly. How does this relate to phishing, you wonder? Well, 91% of cyberattacks start with phishing, that's how.

Phishing is a type of social engineering in which an attacker sends a fraudulent message designed to trick a victim into revealing sensitive information to the attacker or into deploying malicious software, such as ransomware, on the victim's infrastructure. Bad actors continually capitalize on widespread fear and uncertainty, and you are your first line of defense.

These are red flags that could indicate you are about to be phished by email:
  - Is there a sense of urgency? Does it try to make you do something fast?
  - Is this email expected?
  - Does it contain an attachment?
  - Does it talk about some "error" or "due date"?
  - Is the email tagged with EXTERNAL EMAIL?
  - Is the FROM in the email different from the actual email address? (Just hover over the name.)
  - If it sounds too good to be true, then it is.
  - Think twice before clicking on a link. Hover over it and check that the domain is legitimate. Bad guys usually remove or replace a letter in URLs, for example, "AMAZ0N.COM". See what I did there?

Use these red flags and your gut feeling, and think twice before clicking on a link. Your company and your bank account will thank you later.

Carlos Barroso Senior MLOps Engineer teracloud.io #Teracloud #TeraTips #aws #awslatam #DevOps #Cybersecurity #Phishing #Cybersecuritymonth Follow us on our social networks for more TeraTips
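The "AMAZ0N.COM" trick mentioned in the red flags above can be caught mechanically in simple cases. The sketch below is a deliberately naive illustration: the substitution table and the function name are my own assumptions, it only handles digit-for-letter swaps, and it does not detect removed letters or Unicode lookalikes.

```python
from typing import Set

# Digits commonly swapped in for similar-looking letters (illustrative, not exhaustive).
LOOKALIKES = str.maketrans({"0": "o", "1": "l", "3": "e", "5": "s"})


def looks_like_spoof(domain: str, trusted: Set[str]) -> bool:
    """Return True if domain is not trusted but imitates a trusted one.

    trusted is a set of lowercase domains the caller considers legitimate.
    """
    candidate = domain.strip().lower()
    if candidate in trusted:
        # An exact match is the real thing, not a lookalike.
        return False
    # Undo the digit substitutions and see whether a trusted name appears.
    return candidate.translate(LOOKALIKES) in trusted


if __name__ == "__main__":
    trusted_domains = {"amazon.com"}
    for name in ("AMAZ0N.COM", "amazon.com", "example.org"):
        print(name, looks_like_spoof(name, trusted_domains))
```

A check like this is only a complement to the habit of hovering over a link and reading the domain carefully.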
- How do companies benefit when machines learn?
Machine Learning, or ML, is a branch of Artificial Intelligence (AI) that mainly consists of mathematical and statistical models that help machines learn from data, and it is currently the development engine of many companies.

For some years now, organizations have been interested in developing image recognition, shape recognition, and natural language understanding (the field that studies the interactions between computers and human language), among others, in order to improve their processes, and they are succeeding, easing the path to digital transformation that companies are traveling today. In this way, many processes are automated without any human intervention, and the technology provides many advantages. If you want to know how it can help you in your professional routine, keep reading!

Some examples of companies recognized worldwide for successfully using machine learning are Amazon, Netflix, and Spotify.

Spotify uses a combination of different data capture and segmentation methods to create its recommendation model, the flagship of the platform, called Discover Weekly. With it, Spotify provides its users with a list of songs that they have not heard before but that, according to their listening history, it believes will catch each user's attention. An important fact: the list is totally personalized.

It's important to note that Machine Learning is not a strategy reserved for large companies. One of the greatest difficulties for small and medium-sized companies lies in analyzing and drawing conclusions from the large amount of information they collect from their users and potential customers. Machine Learning helps here, allowing companies to optimize their manufacturing processes and operations and to understand potential customers, improving their internal efficiency.
Advantages of applying Machine Learning in the company
  - Better customer service
  - Increased sales
  - Improved customer engagement
  - Fewer errors
  - Preventive actions
  - Cybersecurity
  - Fraud detection
  - Process automation
  - Improved decision making at both production and business levels

Thanks to this technology, machines also perfect their tasks, increasing both the quantity and the quality of their output. Implementing processes associated with Machine Learning is a great step in the digital transformation of companies. For this, it's important to have a stable and reliable network that supports the correct functioning of the processes associated with that transformation.

The data revolution is here, and using data to run and transform your business is the new standard. Getting value out of your data is a difficult, skill-intensive process, and doing it efficiently and quickly requires the right people and the right tools. Drawing on our experience with AI projects, we designed a set of processes and tools to help you get more value from your data models, get it sooner, and continue to get it over time.

Our MLOps service can be deployed in small increments, which compound over time. We typically start by automating the two ends of the Machine Learning Development Lifecycle: data ingestion and model deployment. First, we ensure your data teams are getting a clean and steady stream of data to produce the optimized models your business requires. Next, we automate the deployment and observability of these models to make better use of your Data Scientists' time and abilities.

Beyond that, the sky is the limit! We add value along the whole Machine Learning Development Lifecycle with tools like SageMaker Experiments, so you can always retrace your steps and reproduce any model you built in the past. Another hot tool is Amazon SageMaker Clarify, which finds biases in your trained models and in your input data, so you can eliminate them sooner and speed up the process.
In the businesses of the future, there will be more and more talk about Machine Learning. Are you ready to venture into the world of Machine Learning? Liliana Medina Social Media and Digital Content Manager Teracloud If you want to know more about our services, email our team member firstname.lastname@example.org #Teracloud #aws #AWSLatam #TeraBlog #ML #AI #Data #cloudcompanies #datacompanies #digitaltransformation #automate #MachineLearning #MLOps #artificialintelligence
- Data fundamentals on AWS: Data Pipelines
As our world becomes more and more data-oriented, we need good conceptual frameworks for communicating our ideas and for making the best design decisions. This blog post describes a mental model of a data processing pipeline with two main goals: to establish a common language and concepts, and to help us design data pipeline architectures following best practices. From a practical standpoint, this framework will guide you in designing and evaluating any data pipeline, from a simple batch process to a petabyte-scale stream processing architecture. We will develop a real example along with our explanation for didactic purposes.

Use case

As with anything we want to learn, a good, practical example is the way to go. We describe a simple requirement for a data pipeline, and we will develop the concepts along with the implementation:
  - I want to upload a CSV file to a given S3 bucket (Capture stage).
  - I want an automatic calculation to be performed on the input data (Ingest stage).
  - I want to build a new CSV file with the result of that calculation and each ID of the original CSV (Transform stage).
  - I want to store the new CSV file in an S3 bucket (Store stage).
  - I want to be able to consume the CSV over HTTP (Profit stage).

These steps are chained together, and the first one triggers all the others.

5 stages of a data pipeline

To simplify our understanding of data management pipelines, we define 5 different, consecutive stages, going from raw data to getting actual value from it. DISCLAIMER: This is a humongous simplification for educational purposes; please do not take it as a literal implementation guide. We can think of these steps as sequential, each one producing the input of the next.

Capture stage

The pipeline's life starts as soon as the data is produced. We want to have some artifact sitting right next to the data production process to capture the data and send it to the ingestion process of our data pipeline.
At this stage, you face challenges such as not losing any data, transmitting it efficiently, securing it, authenticating it, etc.

Ingest
The ingestion process bridges the data between the producing agent and the location where the data will continue down the pipeline. It receives the data from the Capture process and optionally sorts, cleans, enriches, discards, or relays some or all of the data. It can receive data in batches (as files, for example) or as a stream. As you work more and more with data pipelines, you discover that all these stages can be combined, reordered, or skipped completely in the name of efficiency, security, or simple operational convenience.

Transform
This process transforms the data for its different uses. You may transform the data for use in analytics, and also convert it to a format optimized for storage, for example. This is normally where you do the most expensive data sanitization and transformation, because the data has already been cleaned (if that was done at ingestion) and you have more computing power available to run it.

Storage
When the data is ready, it is stored in a durable medium for consumption by the value-generating processes. The data can be copied multiple times and sent to different storage mediums according to its expected uses, such as a data lake for long-term storage, a data warehouse for analytical processing, or even SQL and NoSQL databases.

Profiting from data
In this step, you start getting value from your data. From simple analytics to the most sophisticated deep learning techniques, the possibilities are endless.

Moving forward
There are many variations of these paths, such as repeating or looping steps, additional processes, and more. We will explain these variations with real-world examples in the next posts of this series. If you are trying to profit from your data, the most important part for you is the last step.
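To make the five stages above more concrete, here is a minimal Python sketch of the CSV use case. It is only an illustration under several assumptions that are not part of the original post: an in-memory dictionary stands in for the S3 buckets, the input CSV is assumed to have `id` and `value` columns, the "calculation" is a hypothetical doubling of `value`, and all function names are made up for this example. A real implementation would replace the dictionary with S3 and chain the stages with an event trigger.

```python
import csv
import io

# In-memory stand-in for the input and output S3 buckets (assumption for this sketch).
BUCKETS = {"input": {}, "output": {}}


def capture(key, csv_text):
    """Capture: the producer uploads the raw CSV to the input bucket."""
    BUCKETS["input"][key] = csv_text
    return key


def ingest(key):
    """Ingest: read the upload, discard malformed rows, run the calculation."""
    reader = csv.DictReader(io.StringIO(BUCKETS["input"][key]))
    rows = []
    for row in reader:
        try:
            value = float(row["value"])
        except (KeyError, TypeError, ValueError):
            continue  # drop rows that cannot be parsed instead of failing the batch
        # Hypothetical calculation: double the value.
        rows.append({"id": row["id"], "result": value * 2})
    return rows


def transform(rows):
    """Transform: build a new CSV with each original ID and its result."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "result"])
    for row in rows:
        writer.writerow([row["id"], row["result"]])
    return out.getvalue()


def store(key, csv_text):
    """Store: persist the new CSV in the output bucket."""
    BUCKETS["output"][key] = csv_text
    return key


def profit(key):
    """Profit: consume the stored CSV (stand-in for an HTTP GET)."""
    return BUCKETS["output"][key]


def run_pipeline(key, csv_text):
    """Chain the stages so that the upload triggers all the others."""
    uploaded = capture(key, csv_text)
    rows = ingest(uploaded)
    new_csv = transform(rows)
    stored = store(key, new_csv)
    return profit(stored)


if __name__ == "__main__":
    print(run_pipeline("data.csv", "id,value\n1,2\n2,oops\n3,1.5\n"))
```

Keeping each stage as a separate function mirrors the idea that every stage produces the input of the next, and makes it easy to combine, reorder, or skip stages later.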
Our DataOps teams at Teracloud can take care of all the previous steps with the utmost efficiency, quality, and security.
Carlos Barroso
Senior MLOps Engineer
teracloud.io
- Introducing ec2-instance-selector
When provisioning a spot fleet, picking the right instance type can be a little overwhelming: you log in to the console, look up instance capacity, and so on. That's why AWS provides a really useful CLI tool, ec2-instance-selector (https://github.com/aws/amazon-ec2-instance-selector), which makes our lives much easier. You just run the command, passing resource criteria such as vCPUs, network performance, RAM, etc. Be sure to have your AWS credentials loaded when you run it.

➜ ec2-instance-selector --vcpus=2 --memory=4 --cpu-architecture=x86_64 --gpus=0
c5.large
c5a.large
c5ad.large
c5d.large
t2.medium
t3.medium
t3a.medium

There are lots of filters to use; check them out with --help:

➜ ec2-instance-selector --help
(...)
Usage:
  ec2-instance-selector [flags]

Examples:
  ec2-instance-selector --vcpus 4 --region us-east-2 --availability-zones us-east-2b
  ec2-instance-selector --memory-min 4 --memory-max 8 --vcpus-min 4 --vcpus-max 8 --region us-east-2

Filter Flags:
      --allow-list string            List of allowed instance types to select from w/ regex syntax (Example: m[3-5]\.*)
  -z, --availability-zones strings   Availability zones or zone ids to check EC2 capacity offered in specific AZs
      --baremetal                    Bare Metal instance types (.metal instances)
  -b, --burst-support                Burstable instance types
  -a, --cpu-architecture string      CPU architecture [x86_64/amd64, i386, or arm64]
      --current-generation           Current generation instance types (explicitly set this to false to not return current generation instance types)
      --deny-list string             List of instance types which should be excluded w/ regex syntax (Example: m[1-2]\.*)
  -e, --ena-support                  Instance types where ENA is supported or required
  -f, --fpga-support                 FPGA instance types
(...)

So stop wasting time looking for instance types and give this powerful tool a shot!
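If you want to feed the result into a script, for example to fill in the instance types of a spot fleet request, the command above can be wrapped in a few lines of Python. This is a minimal sketch, not an official API: the helper names (`build_command`, `parse_instance_types`, `select_instance_types`) are hypothetical, only flags shown in the post are used, and it assumes the tool prints instance type names separated by whitespace, as in the output above.

```python
import subprocess


def build_command(vcpus, memory_gib, cpu_architecture="x86_64", gpus=0, region=None):
    """Build the argument list for ec2-instance-selector from resource criteria."""
    cmd = [
        "ec2-instance-selector",
        f"--vcpus={vcpus}",
        f"--memory={memory_gib}",
        f"--cpu-architecture={cpu_architecture}",
        f"--gpus={gpus}",
    ]
    if region:
        cmd += ["--region", region]
    return cmd


def parse_instance_types(output):
    """Split the tool's output into a list of instance type names."""
    return output.split()


def select_instance_types(**criteria):
    """Run the CLI (requires the tool installed and AWS credentials loaded)."""
    result = subprocess.run(
        build_command(**criteria),
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits with an error
    )
    return parse_instance_types(result.stdout)


if __name__ == "__main__":
    # Same criteria as the example in the post.
    print(select_instance_types(vcpus=2, memory_gib=4))
```

Separating command construction and output parsing from the actual subprocess call keeps the two pure helpers easy to check without AWS access.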
Leandro Mansilla
DevOps Engineer
Teracloud