Last Update: 2020-04-06 18:30 IST Unix epoch: 1586194224

Troubleshooting an unresponsive shell prompt

You run df -h or ls / and the terminal freezes, and not even CTRL + C works: you have a lock.

Normally this happens because the system is blocked trying to perform an I/O operation.

It could be a physical spinning disk failing, but the most probable cause nowadays is that you have a network mount point and it is timing out.

If you execute mount and you get a timeout, and when you finally see the list it includes an NFS, iSCSI or another kind of network mount (you will see an IP address), check for errors.
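As a rough sketch of that first check, the Python snippet below lists the network filesystems found in text shaped like /proc/mounts (device, mount point, filesystem type, options). The set of filesystem types, the function name and the sample lines are assumptions for illustration. iSCSI volumes appear as ordinary block devices, so they would not be detected this way.

```python
# Filesystem types treated as "network mounts" here (an illustrative, partial set).
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "fuse.sshfs"}


def network_mounts(mounts_text, fs_types=NETWORK_FS_TYPES):
    """Return (device, mount_point, fs_type) for each network mount found."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] in fs_types:
            found.append((fields[0], fields[1], fields[2]))
    return found


# Hypothetical sample; on a real box you would read the text from /proc/mounts.
sample = """/dev/sda1 / ext4 rw,relatime 0 0
storage07:/export/data /mnt/data nfs4 rw,relatime,vers=4.1 0 0
"""
for device, mount_point, fs_type in network_mounts(sample):
    print(device, mount_point, fs_type)
```

Reading /proc/mounts only asks the kernel for its mount table, so it should not touch the remote server the way ls or df on the mount point does. Once a network mount shows up, the next step is to check the kernel log for errors.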

To do this in CentOS/RHEL you can run, as root:

dmesg | grep -i "timed"

or, depending on the system:

grep -i "timed" /var/log/messages

You’ll get something like this:

[root@compute01 carles]# dmesg -T | grep timed | head -n5
[Fri Mar 20 02:27:44 2020] nfs: server storage07 not responding, timed out
[Fri Mar 20 02:27:44 2020] nfs: server storage07 not responding, timed out
[Fri Mar 20 02:27:44 2020] nfs: server storage07 not responding, timed out
[Fri Mar 20 02:27:44 2020] nfs: server storage07 not responding, timed out
[Fri Mar 20 02:27:45 2020] nfs: server storage07 not responding, timed out

Please note I use dmesg -T in order to have human-readable dates instead of the default timestamps, which are seconds since boot.

You can count today’s errors:

[root@compute01 carles]# dmesg -T | grep time | grep "Mon Apr 6" | wc --lines
3123
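The same count can be sketched in Python, grouped by server and day, which helps when more than one storage server is involved. This assumes the dmesg -T line format shown in the sample above. The function name and the sample lines are made up for the example.

```python
import re
from collections import Counter

# Matches lines like:
# [Fri Mar 20 02:27:44 2020] nfs: server storage07 not responding, timed out
# Assumption: the format printed by `dmesg -T` in the sample above.
# \s+ before the day allows for single-digit days padded with an extra space.
LINE_RE = re.compile(
    r"^\[\w{3} (?P<mon>\w{3})\s+(?P<day>\d{1,2}) "
    r"\d{2}:\d{2}:\d{2} (?P<year>\d{4})\] "
    r"nfs: server (?P<server>\S+) not responding, timed out"
)


def count_nfs_timeouts(lines):
    """Return a Counter keyed by (server, 'Mon D YYYY')."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            day = "{} {} {}".format(
                match.group("mon"), int(match.group("day")), match.group("year")
            )
            counts[(match.group("server"), day)] += 1
    return counts


sample = [
    "[Fri Mar 20 02:27:44 2020] nfs: server storage07 not responding, timed out",
    "[Fri Mar 20 02:27:45 2020] nfs: server storage07 not responding, timed out",
    "[Mon Apr  6 10:00:00 2020] nfs: server storage07 not responding, timed out",
    "[Mon Apr  6 10:00:01 2020] usb 1-1: new high-speed USB device",
]
for (server, day), total in sorted(count_nfs_timeouts(sample).items()):
    print(server, day, total)
```

On a real system the lines could come from sys.stdin, piping dmesg -T into the script.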

I cancelled my Amazon Prime subscription

I was using Amazon a lot, sending parcels to the offices of my previous job and now to the Blizzard offices, so I subscribed to Amazon Prime. With the COVID-19 virus we were sent to work remotely, and now with the lockdown I’m basically at home 99.99% of the time.

I did a test to see how home delivery works during the pandemic.

I chose two different items and reviewed the order: they were going to be delivered separately, one day apart.

I chose two items that would fit in my mailbox, separately or together: a 3 m USB3 male-to-female cable and a Blu-ray movie.

My surprise came when I went to the mailbox one day earlier than expected and found a slip from An Post saying that they had passed by to deliver my parcel, but did not leave it because it didn’t fit in the mailbox and they did not want to leave it in a common space. To my surprise, both Amazon parcels had been grouped and sent ahead of time, maybe in a bigger box. But the mailman did not ring my doorbell.

The slip tells me to collect my parcel in the middle of the city, during the lockdown. No way! I’m not going to risk my health, and especially that of the elderly, just to grab a cable and a movie.

I had the chance to request redelivery from An Post, so I did. I filled in all the info, gave my phone number and email, indicated which door to ring, and two days later, as promised… a slip from An Post!

Again, they did not even ring my bell.

I went to Amazon to cancel the order, but the process is only designed for when you have received the items.

Fuck it. I’m not going to order anything else from Amazon until COVID-19 passes.

I don’t know whether the postman just avoids people for fear of contagion, or An Post’s process is awful and he didn’t get any information. But I won’t buy anything, even if I cannot buy in other places because of the lockdown.

I was going to keep my Amazon Prime subscription, even knowing that I won’t use it much during the lockdown, but it makes no sense. Also:

  • I use Netflix and my Raspberry Pi 4, so I was not using Amazon Prime Video.
  • I use Spotify, so I was not using Amazon Prime Music.
  • I like to read on paper, not eBooks, so I was not using the eReader options.

A nice way to lose a customer.

Datacenters, D&R and coronavirus

I’ve been working for years in Data centers, with D&R strategies, and now, in the middle of COVID-19, with huge demand for more bandwidth and compute, some DCs have decided not to let their customers’ Engineers in.

As somebody who has had his own Startup and CSP, with infrastructure in DCs and customers’ servers in colocation, and who has replaced Hw components at 1 AM, replaced drives in broken RAIDs, and fixed systems so many times inside so many Datacenters across the world, I’m shocked by that.

I understand that health reasons can be argued, but I still have Servers in Datacenters because we all believed they were the safest place, prepared for disaster and recovery, with security, 24×7… and now one realises that one cannot enter to fix or upgrade one’s own machines.
Please note that you can still use the DC’s remote hands, although many times this is not a good idea, and I’m not sure it will remain an available option when the lockdown in those countries becomes stricter.

I’m wondering whether the current DC model has any future at all.

I think most D&R strategies from now on will be in the cloud, in different regions, with different providers, so companies can withstand providers or governments letting them down.

Media Player on my Raspberry Pi 4

Just installed a media player on my Raspberry Pi 4

So, as I mentioned, it was one of my pending tasks to do while I’m confined here at home, to help the Irish government stop the quick spread of the coronavirus.

I’m happy that the situation in Ireland has stabilized, unlike in Spain, where the historical lack of discipline, the selfishness and the super ego of believing Madrid to be the capital of the world, and so deciding not to close it for quarantine, will cause a lot of pain. I hope the closing of the borders in Catalonia works.

Well, you’re probably asking yourself what I did: I installed LibreELEC https://libreelec.tv/.

They have a very nice SD image writer for Linux, Mac and Windows that will install the proper image on the micro-SD for your ARM device.

This Raspberry Pi 4 comes with integrated Wi-Fi and a Gigabit Ethernet network port.

When I was in Barcelona, I had Kodi on a Raspberry Pi 2 and on a version 3.

This model, v. 4, is much cooler. I bought the 4GB version, and it has 2x HDMI 4K.

So it is great to connect to any modern TV.

In Barcelona, I had a Linux tower as an NFS Server sharing my files with the Pi. It worked well, even with the 100Mbit NIC of the version 3, but at that time I was only playing Full HD, as the Pi didn’t support higher resolutions and my displays only had that resolution too.

For now, I’m going to explore how it performs reading from a USB 3.0 stick. Let’s see if it’s able to play smoothly.

The cool thing also is that I have SSH access, and so I can use the Pi for many more things. :)

I have my first update. I noticed that copying to that USB stick was not the best option for me: I tried to copy a 4.9GB .MKV file and hit the FAT32 limit of 4GB per file. I could have formatted the USB stick as ext4, but what I did instead was SSH into the box, where I saw that the SD has two partitions for booting the Pi, and the second one is an ext4 partition called storage. So I copied the file I wanted to the SD, through the network, using sftp.

The Gigabit connection was fast, but when the buffer filled up it started to show the real write speed of the SD, which is 15MB/s.

Ext4 has no problem holding a 4.9GB file, so I’m watching my movie now. I will think about setting up an NFS share for the Pi, as it will be very convenient. :)
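The numbers in this update can be checked with a little arithmetic. The Python sketch below assumes decimal units (1GB = 1,000,000,000 bytes) for the file size and the SD speed. The function names are only illustrative.

```python
# FAT32 stores each file's size in a 32-bit field, so a single file
# can be at most 4 GiB minus one byte.
FAT32_MAX_FILE_BYTES = 2**32 - 1


def fits_on_fat32(size_bytes):
    """True if a file of this size can be stored on a FAT32 volume."""
    return size_bytes <= FAT32_MAX_FILE_BYTES


def seconds_to_write(size_bytes, bytes_per_second):
    """Time to write the file at a sustained speed, ignoring buffering."""
    return size_bytes / bytes_per_second


movie_bytes = 4_900_000_000      # the 4.9GB .MKV, assuming decimal GB
sd_write_speed = 15_000_000      # 15MB/s, the SD write speed observed

print("Fits on FAT32:", fits_on_fat32(movie_bytes))
print("Seconds to write to the SD:", round(seconds_to_write(movie_bytes, sd_write_speed)))
```

So at the sustained SD speed the copy takes a bit over five minutes, no matter how fast the Gigabit link is.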

I have an external, remote Logitech keyboard, but it turns out that LibreELEC recognizes my Sony remote control, the one from the television. I don’t need the keyboard/mouse. Nice.

Here you can see my Raspberry Pi 4, connected to the TV, in “combat mode”, naked, as a PoC, before setting it in its definitive place behind the TV.

Playing from the external USB 3.0 stick was also fluid, allowing 4K perfectly.

The only problem I had was when I was pushing movies to the USB through the network and playing from the SD at the same time. It seems the Raspberry reached its limits doing this, and playback got stuck frequently.

Remote working is here

So remote working is here.

After years in which many Engineers asked their companies to be able to work remotely, with most of the answers being No, it now turns out that it is not only good for the company, it is the only way to ensure the continuity of business, of many businesses.

One of my colleagues from Denmark, whose government has shut down the country by sending all public servants home in order to prevent the spread of the coronavirus, told me:

“Yes, remote working is here, but it took the four horsemen of the apocalypse”

It is curious how Remote Working has arrived: not because it was the obvious thing to do, but due to external emergencies. And I’m glad that my company was prepared for business continuity.

I’ll be staying home, working remotely, in order to help stop the spread of the virus, especially among old people. I’m perfectly healthy, but that’s just one case: many people will not develop symptoms and will still be able to spread it to others.

So I have some technology-related plans to do at home, including a few improvements to the blog. What are your plans?

Update: 2020-03-13 23:16 UTC I’m thinking of all those businesses which are forced to close, and of all the employees who will not get a salary or will be fired, or who will get a salary while the business owner maybe ends up bankrupt, paying salaries with no income being generated.

Update: 2020-03-19 10:58 UTC Some of my friends, even in Human Resources/Recruiting, are starting to work remotely for the first time. So here is some advice:

I would recommend getting an external monitor, at least 22″, so your neck is not forced into a low position and your eyes don’t suffer; good light (don’t work in the dark); a Nespresso, which can be a good friend in the morning; and having your hands and arms aligned correctly so you don’t suffer from a bad posture. Watch the position of your wrists: your arms should rest comfortably at the same level as the table, forming something like an L, and your eyes should be aligned with the top of your monitor. Finally, I would recommend following a routine, as if you were going to work, so dress like you would. Don’t stay at home all day in pyjamas! ;)

News for the blog, upwards and onwards

2020-03-06 Heya, I’m doing a set of improvements to the blog.

One you can already see: I added a new section to the CSS @media rules, so now screens wider than 1,800 px will use that width for rendering the page. The original WordPress theme, at 960 px, was too small for our current screens. I will add a new CSS @media rule for 4K screens soon.

The other is about the organization of the content. I want to separate the contents a bit: now articles are sequential, and it is difficult to discover nice content once it has 2 or more recent articles above it, so I will group articles by topic and provide a small index at the top of the page. I will also provide more areas for Operations and SRE, where it will be easy to locate code, scripts, tricks… things that are useful in our day to day. I also want to make the articles about living in different cities, for IT Engineers, more visible, with useful tricks and tips. And I will keep the more complex and more interesting Engineering matters on the main page.

Updates

2020-03-13 15:49 Added SSL to the blog

With more delay than I wanted, I bought an SSL certificate and configured Apache, and after a few changes to the blog it has been set up. One very annoying thing is that WordPress linked the images statically, pointing to http://blog.carlesmateo.com, so I changed the latest article’s images to point to a relative path, so they work nicely with http or https.

My reflection is that everything negative can have its positive outcome. With this coronavirus thing, I decided to focus on improving things. And so I am. :)

CTOP

CTOP is a tool for Linux System Administration that I’ve written in Python3, which uses only the System (/proc), and no third-party libraries, to get all the information it requires.

The purpose of this tool is to help identify problems and troubleshoot from a single tool, with a single view that has all the typical indicators.

It provides in a single view information that is typically provided by many programs:

  • top, htop for the CPU usage, process list, memory usage
  • meminfo
  • cpuinfo
  • hostname
  • uptime
  • df to see the free space in / and the free inodes
  • iftop to see real-time bandwidth usage
  • ip addr list to see the main IP for the interfaces
  • netstat or lsof to see the list of listening TCP Ports
  • uname -a to see the Kernel version

Other cool things it does:

  • Identifies if you’re inside an Amazon VM, VirtualBox, Docker or LXC
  • Uses colors, marking warnings in yellow and errors in red: problems like little disk space remaining, or high CPU usage according to the available cores and CPUs.
  • Redraws the screen and adjusts to the size of the Terminal; a bigger terminal displays more information
  • It doesn’t use external libraries, and does not escape to shell. It reads everything from /proc /sys or /etc files.
  • Identifies the Linux distribution
  • Shows the most repeated binaries, so you can identify DDoS attacks (like having 5,000 apache instances where you have normally 500 or many instances of Python)
  • Indicates if an interface has the cable connected or disconnected
  • Shows the Speed of the Network Connection (useful for Mellanox cards that can operate at 200Gbit/sec, 100, 50, 40, 25, 10…)
  • It displays the local time and the Linux Epoch Time, which is universal (very useful for logs and to detect when there was an issue; for example, if your system restarted, your SSH Session would keep the latest Epoch captured)
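To give an idea of what reading everything from /proc without external libraries can look like, here is a minimal Python sketch. It is not CTOP’s actual code. The function names and the 80%/95% thresholds are assumptions chosen only for illustration, and the sample text imitates the layout of /proc/meminfo.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are expressed in kB; a few (e.g. HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_used_percent(meminfo):
    """Percentage of memory not available for new work."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def severity(percent, warn=80.0, error=95.0):
    # Hypothetical thresholds: yellow for warnings, red for errors.
    if percent >= error:
        return "red"
    if percent >= warn:
        return "yellow"
    return "normal"


sample = """MemTotal:       16000000 kB
MemFree:         1000000 kB
MemAvailable:    2000000 kB
HugePages_Total:       0
"""
info = parse_meminfo(sample)
used = memory_used_percent(info)
print("Memory used: {:.1f}% ({})".format(used, severity(used)))
print("Epoch:", int(time.time()))  # Unix epoch, the same in every time zone
```

On a Linux box the sample string would be replaced by the contents of /proc/meminfo, read with a plain open().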

Limitations:

  • It only works on Linux, not on Mac or Windows, although the idea is to help with Linux Server Administration and Troubleshooting
  • The list of processes of the System is read every 30 seconds, to avoid adding much overhead on the System
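The 30 seconds refresh and the count of the most repeated binaries could be sketched like this in Python. Again, this is an illustration under assumptions, not CTOP’s implementation: the class and function names are hypothetical, and process names are taken from /proc/<pid>/comm.

```python
import os
import time
from collections import Counter


def read_process_names(proc_root="/proc"):
    """Return the command name of every process found under proc_root."""
    names = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only the numeric directories are PIDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                names.append(handle.read().strip())
        except OSError:
            continue  # the process ended while we were scanning
    return names


class CachedProcessList:
    """Re-read the process list at most once every max_age seconds."""

    def __init__(self, reader=read_process_names, max_age=30.0, clock=time.monotonic):
        self._reader = reader
        self._max_age = max_age
        self._clock = clock
        self._names = None
        self._read_at = None

    def names(self):
        now = self._clock()
        if self._read_at is None or now - self._read_at >= self._max_age:
            self._names = self._reader()
            self._read_at = now
        return self._names

    def most_repeated(self, top=3):
        return Counter(self.names()).most_common(top)


def fake_reader():
    # Made-up process list, so the demo also runs outside Linux.
    return ["apache2"] * 5 + ["python3"] * 2 + ["sshd"]


cache = CachedProcessList(reader=fake_reader)
for name, count in cache.most_repeated(2):
    print(name, count)
```

The screen can then be redrawn as often as needed while the expensive walk over /proc happens only when the cached list is older than 30 seconds.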

I decided to code name the version 0.7 as “Catalan Republic” to support the dreams and hopes and democratic requests of the Catalan people to become an independent republic.

I created this tool as Open Source and if you want to help I need people to test under different versions of:

  • RedHat (I have no longer commercial licenses)
  • Atypical Linux distributions

If you are a Cloud Provider and want me to implement the detection of your VMs, so the tool knows that it is an instance of Amazon, Google, Azure, Cloudsigma, Digital Ocean… contact me through my LinkedIn.

Monitoring an Amazon Instance, take a look at the amount of traffic sent and received

Some of the features I’m working on are parsing the logs to check for errors, kernel panics, processes killed due to lack of memory, iSCSI disconnects and NFS errors, and checking the logs of MySQL and Oracle databases to locate errors.

Resources for Microservices and Business Domain Solutions for the Cloud Architect / Microservices Architect

First you have to understand that Java and PHP are completely different worlds.

In PHP you’ll use a Framework like Laravel, or Symfony, or Catalonia Framework (my Framework) :) and one repo or many (as the idea is that a change in one microservice cannot break another, it is recommended to have one git repo per Service), and you’ll split the requests with the API Gateway and Filters (so /billing/ goes to the right path on the right Server; it is like rewriting URLs). You’ll rely on Software to split your microservices. Usually you’ll use Docker, but you have to add a Web Server and any other tools, as the source code is not packaged with a Web Server and other Dependencies like it is in Java Spring Boot.
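The idea of the Gateway splitting requests by path can be sketched in a few lines of Python. This is only a toy illustration of prefix routing: the backend host names, the port and the /catalog/ route are hypothetical, and a real API Gateway does much more (authentication, filters, retries).

```python
# Hypothetical routing table: path prefix -> backend Service that owns it.
ROUTES = {
    "/billing/": "http://billing.internal:8080",
    "/catalog/": "http://catalog.internal:8080",
}


def route(path, routes=ROUTES):
    """Return the backend URL for a request path, or None if no prefix matches."""
    # Try the longest prefix first, so a more specific route wins.
    for prefix in sorted(routes, key=len, reverse=True):
        if path.startswith(prefix):
            return routes[prefix] + "/" + path[len(prefix):]
    return None


print(route("/billing/invoices/42"))
print(route("/unknown/"))
```

The prefix is stripped before forwarding, which is the URL-rewriting part: the billing Service only ever sees /invoices/42.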

In Java you’ll use Spring Cloud and Spring Boot, and every Service will be self-contained in its own JAR file, which includes Apache Tomcat and all other Dependencies, normally running inside a Docker container. The TCP/IP listening port will be set at start via the command line or through the environment. You’ll have many git repositories, one per Service.
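The pattern of choosing the listening port at start is not specific to Java. Here is a small Python sketch of the same precedence (command line first, then environment, then a default). The --port flag and the SERVICE_PORT variable are hypothetical names for this example, not Spring Boot settings.

```python
import argparse


def resolve_port(argv, environ, default=8080):
    """Command line wins over the environment, which wins over the default."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--port", type=int, default=None)
    args, _unknown = parser.parse_known_args(argv)
    if args.port is not None:
        return args.port
    if "SERVICE_PORT" in environ:  # hypothetical variable name
        return int(environ["SERVICE_PORT"])
    return default


print(resolve_port(["--port", "9001"], {"SERVICE_PORT": "9002"}))
print(resolve_port([], {"SERVICE_PORT": "9002"}))
print(resolve_port([], {}))
```

In a real Service you would pass sys.argv[1:] and os.environ. Keeping the port out of the artifact is what lets the same immutable image run unchanged in every environment.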

Using many repos, one per Service, also allows you to deploy only that repository and to have better security, with independent deployment tokens.

It is not unlikely that you’ll use one language for some of your Services and another for others, as well as one Database or another, as each Service owns its data.

In any case, you will be using CI/CD and your pipeline will be something like this:

  1. Pull the latest code for the Service from the git repository
  2. Compile the code (if needed)
  3. Run the Unit and Integration Tests
  4. Package the service into an executable artifact (e.g. a Java JAR with the Tomcat server and other dependencies)
  5. Generate a Machine image with your JAR deployed (for Java; look at the Spotify Docker Plugin to Docker build from Maven), or with Apache, PHP, other dependencies, and the code. Normally it will be a Docker image. This image will be immutable. You will probably use Dockerhub.
  6. The Machine image is started. Platform tests are run.
  7. If platform tests pass, the service is promoted to the next environment (for example Dev -> Test -> PreProd -> Prod), the exact same machine is started in the next environment and platform tests are repeated.
  8. Before deploying the new Service to Production, I recommend running special Application / Behavior-driven Tests. By this I mean conducting tests that really test the functionality of everything, using a real browser and emulating the actions of a user (for example with BeHat, Cucumber or with JMeter).
    I recommend this especially because Microservices are end-points, independent of the implementation, but normally they are APIs that serve a whole application. In an Application there are several components, and often a change in the Front End can break the application. Imagine a change in the Javascript Front End that results in a slightly different call, for example with a space before a name. Imagine that the Unit Tests for the Service do not test that, and that it was not causing a problem in the old version of the Service, so it will crash when the new Service is deployed. Or another example: imagine that our Service for paying with Visa cards generates IDs for the Payment Gateway, and as a result of the new implementation the IDs generated are returned. With the mocked objects everything works, but it is when we deploy for real that we are going to use the actual Bank Payment. This is also why it is a good idea to have a PreProduction environment, with PreProduction versions of the actual Services we use (all banks, and the GDS for flight/hotel reservations like Galileo or Amadeus, have a Test Gateway that is exactly like Production).
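The promotion logic of steps 6 and 7 can be summarized with a short Python sketch: the same immutable image is started in each environment, in order, and promotion stops at the first environment where the platform tests fail. The image tag and the fake test function are hypothetical. In a real pipeline this logic lives in your CI/CD tool.

```python
ENVIRONMENTS = ["Dev", "Test", "PreProd", "Prod"]


def promote(image, run_platform_tests, environments=ENVIRONMENTS):
    """Start the same immutable image in each environment, in order.

    Stops at the first environment whose platform tests fail and returns
    the list of environments the image passed.
    """
    passed = []
    for env in environments:
        if not run_platform_tests(image, env):
            break
        passed.append(env)
    return passed


def fake_platform_tests(image, env):
    # Hypothetical result: pretend this image only breaks in PreProd,
    # for example against the bank's Test Gateway.
    return env != "PreProd"


print(promote("billing:1.4.2", fake_platform_tests))
```

With the fake tests above the image never reaches Prod, which is exactly the point of repeating the tests in every environment.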

If you work with Microsoft .NET, you’ll probably use Azure DevOps.

We IT Engineers, CTOs and Architects serve the Business. We have to develop the most flexible approaches, enabling the business to release as fast as it needs.

Take into account that Microservices are a tool, a pattern. We will use them to bring more flexibility and speed when developing, resilience of the services, and speed and independence when deploying. However, this comes at a cost of complexity.

Microservices are more related to giving flexibility to the Business, and developing according to the Business Domains, normally oriented to suit an API. If you have an API that is consumed by third parties you will have things like independence of Services (if one is down the others will still function), gradual degradation, being able to scale only the Services that have more load, being able to deploy a new version of a Service independently of the rest of the Services, etc… The complexity of the technical solution comes from all this resilience and flexibility.

If your Dev Team has up to 10 Developers, or you are writing just a CRUD Web Application or a PoC, or you are a Startup with a critical Time to Market, you will probably not want to use the Microservices approach. It is like killing flies with laser cannons. You can use the typical Web services approach: do everything in one single HTTPS request, have transactions, a single Database, etc…

But if your team has 100 Developers, like a big eCommerce, you’ll have multiple Teams of between 5 and 10 Developers per Business Domain, and you’ll need each Service to be independent, with less interdependence. Each Service will own its own Data, normally around 5 to 7 tables. Each Service will serve a Business Domain. You’ll benefit from having different technologies for the different needs; however, be careful to avoid having Teams with knowledge so different that rotation is hardly possible, and projects that are difficult to continue when the only 2 or 3 Devs who know that technology leave. Typical benefit scenarios can be having MySql for the Billing Services but a NoSQL Database for the image catalog, or to store logs of account activity. With Microservices, some Services will be calling other Services, often asynchronously, using Queues or Streams; you’ll have Callbacks and Databases for reading; you’ll probably want gradual and graceful failure of your applications, client load balancing, caches and read-only databases/in-memory databases… This complexity is there to protect one Service from the failure of others and to give it the necessary speed under heavy load.
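One common way to protect a Service from the failure of another is a circuit breaker with a fallback. The Python sketch below is a deliberately simplified version of that pattern, to show the kind of complexity involved. The class, its thresholds and the recommendations example are assumptions for illustration; in production you would normally rely on a tested library.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency for a while and use a fallback."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, func, fallback):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                return fallback()       # circuit open: do not even try
            self._opened_at = None      # cool-down is over: try again
            self._failures = 0
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            return fallback()
        self._failures = 0
        return result


def broken_recommendations():
    # Hypothetical remote call that is currently failing.
    raise ConnectionError("recommendations service is down")


def empty_recommendations():
    return []  # degraded answer: the page still renders without them


breaker = CircuitBreaker(max_failures=2)
for _ in range(3):
    print(breaker.call(broken_recommendations, empty_recommendations))
```

After two failures the breaker opens, so the third request gets the degraded answer immediately, without waiting on the broken Service.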

Here you can find a PDF Document of the typical resources I use for Microservice Projects.

You can also download it from my github repository:

https://github.com/carlesmateo/awesome-microservices

Do you use other solutions that are not listed? Leave a message. I’ll investigate them and update the Document, to share with the Community.

Update 2020-03-06: I found this very nice article explaining the same. Microservices are not for everybody and not the default option: https://www.theregister.co.uk/AMP/2020/03/04/microservices_last_resort/

Update 2020-03-11: Monzo, with 1,600 microservices, says that microservices architecture is the last resort: https://www.theregister.co.uk/AMP/2020/03/09/monzo_microservices/