In what can you create a business data center

You can create a business data center and perform business data center operations and maintenance in the Management Center.

What is a data center? Wikipedia defines a data center as "a complex set of facilities. It consists not only of computer systems and other ancillary equipment (such as communications and storage systems), but also redundant data communications connections, environmental control equipment, monitoring equipment, and various security devices".

In today's cloud-heavy world, data centers are becoming more and more complex as the scale of data center construction continues to expand and new technologies emerge.

Large data centers are often composed of many unit systems with different functions. Operating and maintaining them requires knowledge of every area, including hardware, networking, servers, storage, security, and the business itself, and this knowledge must be integrated to do the work well.

The larger a data center is, the more technical challenges and problems it faces. Issues that are negligible in a small environment become prominent at this scale, which makes good operation and maintenance of a large data center demanding.

It takes a long time to learn systematically every technical system involved in the data center. Only with a thorough understanding of the data center as a whole can you design targeted operation and maintenance plans.

Developing monitoring and operation and maintenance software tailored to your own needs lets you manage and monitor the entire data center efficiently, improve its operational efficiency, reduce failures, and keep raising the level of operation and maintenance work.

A large data center usually contains many small systems, and operation and maintenance work centers on these specific application systems. It can be divided into six major areas: basic operation and maintenance management, daily business operation and maintenance, network, server, storage, and security. This article discusses which operation and maintenance methods and capabilities a typical large data center should have.

Start with basic operation and maintenance management. It mainly covers hardware configuration management, maintainability optimization, monitoring, alarm handling, automated operation and maintenance, network disconnections, power outages, and disaster recovery. Hardware configuration management records the model and hardware configuration of every server in each cabinet and makes clear which business systems use those servers.
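The hardware configuration record described above can be pictured as a small inventory that links each server to its cabinet and to the business systems that use it. The following is only a minimal sketch: the `Server` fields, the host names, the model names, and the business system names are all hypothetical placeholders, not taken from any real platform.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Server:
    """One inventory record (all field names are illustrative assumptions)."""
    hostname: str
    model: str
    cabinet: str
    cpu_cores: int
    memory_gb: int
    business_systems: List[str] = field(default_factory=list)


def servers_for_business(inventory, system):
    """Return the host names of every server used by the given business system."""
    return [s.hostname for s in inventory if system in s.business_systems]


def businesses_in_cabinet(inventory, cabinet):
    """Return the sorted business systems that depend on servers in one cabinet."""
    found = set()
    for server in inventory:
        if server.cabinet == cabinet:
            found.update(server.business_systems)
    return sorted(found)


# Hypothetical sample data for illustration only.
INVENTORY = [
    Server("web-01", "ModelA", "cab-01", 32, 128, ["loan-portal"]),
    Server("db-01", "ModelB", "cab-01", 64, 512, ["loan-portal", "reporting"]),
    Server("etl-01", "ModelA", "cab-02", 32, 256, ["reporting"]),
]
```

With a record like this, a question such as "which business systems are affected if cabinet cab-01 loses power" becomes a single lookup with `businesses_in_cabinet(INVENTORY, "cab-01")`.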

Even in a virtualized environment, you need to know which physical machines make up the resource pool. Given the sheer number of physical and virtual machines in the data center, automated operation and maintenance is essential.

Automated operation and maintenance (O&M) improves efficiency, reduces human involvement, and frees up staff by letting the data center largely manage itself. It also monitors for possible failures and raises alarms so that you learn of a problem at the first sign of trouble.

A major failure often begins as a small fault that gradually spreads until it brings down the entire system. Small anomalies must therefore be eliminated promptly, and detecting them depends on a sound monitoring and alarm system. Daily business operation and maintenance mainly covers daily inspections, application changes, hardware and software upgrades, and sudden failures.

Specifically:

1, daily inspection: "A thousand-mile dike can collapse because of a single ant hole." Most failures show small warning signs beforehand, and a hidden danger that is not eliminated can grow into a major failure. Routine inspection of the data center is tedious, but it is very important because it uncovers operational risks in time.

Depending on the importance of the business the data center carries, all running equipment should be checked routinely. Check whether the application services on each server are running normally and whether CPU, memory, and other utilization rates are within normal ranges. The machine room environment should also be checked to confirm that temperature, humidity, and dust levels meet requirements.

Inspection also covers whether the air conditioning and power supply systems are running well, whether equipment is overheating, and the condition of the flooring, skylights, fire protection, and monitoring systems. Leaks from air conditioning or equipment threaten the stable operation of the data center and must not be overlooked.

2, application changes: the business carried by a data center is not static. As the business diversifies and develops, it often has to be adjusted, including server and network settings. Maintenance staff therefore need to be very familiar with operating servers and network equipment, mainly Linux server commands and network protocols. Changes must be made promptly and accurately according to the needs of the application.

3, hardware and software upgrades: data center equipment generally has an operating cycle of about five years, so aging equipment must be phased out and replaced continuously, and some equipment needs upgrading because of software defects. Hardware and software upgrades are therefore part of maintenance work. Every upgrade needs a fallback mechanism; otherwise a failed upgrade cannot be rolled back and the business may be unrecoverable for a long time.

Staff who take over data center maintenance are often surprised by how many upgrades there are. Upgrade operations happen almost every month, and staying up late to perform them has become routine for maintenance staff.

4, sudden failure: no data center is free of faults, and problems of one kind or another will occur during operation. When a sudden failure occurs, experienced maintenance personnel can calmly analyze its cause and quickly find a solution. If no solution can be found in a short time, they can restore the business by switching to standby equipment and then analyze the failure afterward.

This is why highly skilled maintenance staff are critical for a data center: they prove their worth in an emergency. These tasks may seem mundane, but do not underestimate them. Routine maintenance is very important to the normal operation of the entire data center business, and only by taking it seriously can the data center run with peace of mind.
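The daily inspection in item 1 above (server utilization plus machine room temperature and humidity) can be sketched as a simple range check. The threshold values and metric names below are illustrative assumptions only; real limits depend on the equipment and on the requirements of each machine room.

```python
# Acceptable (low, high) range per metric. These numbers are assumptions
# chosen for illustration, not recommended operating limits.
THRESHOLDS = {
    "cpu_percent": (0, 85),
    "memory_percent": (0, 90),
    "room_temp_c": (18, 27),
    "humidity_percent": (40, 60),
}


def inspect(readings):
    """Compare one set of readings against THRESHOLDS.

    Returns a list of human-readable findings; an empty list means
    every metric was present and within its range.
    """
    findings = []
    for metric, (low, high) in THRESHOLDS.items():
        if metric not in readings:
            # A missing reading is itself worth reporting.
            findings.append(f"{metric}: no reading")
            continue
        value = readings[metric]
        if not low <= value <= high:
            findings.append(f"{metric}: {value} outside {low}-{high}")
    return findings
```

For example, `inspect({"cpu_percent": 50, "memory_percent": 95, "room_temp_c": 22, "humidity_percent": 45})` reports only the memory reading, which is the kind of small anomaly an inspection is meant to surface before it grows.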

For the data center network, the main topics are network hardware, ACL, OSPF, LACP, VIP, protocol analysis, traffic, load balancing, Layer 2, 3, 4, and 7 behavior, network monitoring, 10 Gigabit line cards, and core switching.

The network is an important part of the data center and the basic guarantee for all the work running on it. Without the network the data center cannot run, so ensuring network stability is the top priority of data center operation and maintenance. Attention should go not only to network hardware but also to software-defined networking (SDN).

In a traditional IT architecture, the network is deployed according to business needs and then goes online. If those needs change, modifying the configuration on the corresponding network equipment (routers, switches, firewalls) is very cumbersome.

In today's fast-changing Internet/mobile Internet business environment, high network stability and performance are not enough to meet business needs, and flexibility and agility are even more critical.

What SDN does is separate the control function from the network devices and manage it with a centralized controller.

The controller does not depend on the underlying network equipment (routers, switches, firewalls) and hides the differences between devices. Control is completely open, so users can define whatever routing and transmission policies they want, which makes the network more flexible and intelligent. After an SDN transformation, there is no need to configure the router at every node one by one; the devices in the network are connected automatically, and only simple network rules need to be defined at the point of use.
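The idea of a centralized controller that holds the rules and pushes them to every device can be illustrated with a toy model. This is a sketch of the concept only, not a real SDN controller or protocol implementation: the `Switch` and `Controller` classes, the port names, and the address ranges are hypothetical, and rule matching is simplified to "first matching destination network wins".

```python
import ipaddress


class Switch:
    """A device with no local policy; it only applies rules it is given."""

    def __init__(self, name):
        self.name = name
        self.flow_table = []

    def install(self, rules):
        # Replace the whole table with the controller's current rules.
        self.flow_table = list(rules)

    def forward(self, dst_ip):
        """Return the action of the first rule whose network contains dst_ip."""
        addr = ipaddress.ip_address(dst_ip)
        for network, action in self.flow_table:
            if addr in network:
                return action
        return "drop"


class Controller:
    """Central point where rules are defined once and pushed to all switches."""

    def __init__(self):
        self.switches = []
        self.rules = []

    def register(self, switch):
        # A newly registered switch immediately receives the current rules.
        self.switches.append(switch)
        switch.install(self.rules)

    def set_rules(self, rules):
        """rules: ordered list of (CIDR string, action) pairs."""
        self.rules = [(ipaddress.ip_network(cidr), action) for cidr, action in rules]
        for switch in self.switches:
            switch.install(self.rules)
```

The point of the sketch is that a policy change is a single call to `set_rules` on the controller, rather than a separate configuration session on each device.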

If you don't like the protocols built into the routers themselves, you can also programmatically modify them for better data exchange performance. For example, Baidu's self-developed switches can directly support SDN's remote configuration and management features to enable fully automated configuration online.

In the future, self-developed switches will go further and combine with server automation to improve server delivery and management efficiency. The network is all-encompassing and involves many devices, protocols, and software-layer technologies, so operation and maintenance staff need to keep learning and deepening their understanding of network technology to do this work well.

For data center servers, the main topics are file systems, kernel parameter tuning, the various kinds of hard disk drives, kernel versions, kernel panics, and so on.

Linux holds a mainstream position not only on servers but also in network operating systems. Mastering Linux makes it easier to handle the operation and maintenance of both servers and network equipment, so Linux is a basic skill for this work. Beyond being familiar with operating Linux, staff also need to monitor and manage the running state of each server and its kernel to reduce server failures.

Large data centers generally contain thousands of servers, and almost every day some of them develop problems of one kind or another. Only a deep understanding of servers makes it possible to resolve these problems well.

In order to prevent server failures from triggering business interruptions, virtualization or clustering technologies are generally deployed on servers, so that when a server's physical hardware fails, the business can be smoothly switched to other servers, and the business will not be affected in any way. These virtualization technologies increase the difficulty of operation and maintenance, and also require continuous in-depth study of virtualization technologies.
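The switch-over described above can be sketched as moving the services of a failed node onto the remaining healthy nodes. This is a deliberately simplified model under stated assumptions: the node and service names are hypothetical, "least loaded" is measured only by the number of services on a node, and real clustering or virtualization platforms use far richer placement logic.

```python
def fail_over(placement, failed_node, healthy_nodes):
    """Return a new placement with the failed node's services reassigned.

    placement: dict mapping node name -> list of service names.
    healthy_nodes: nodes that may receive services (must not be empty).
    Each service goes to the healthy node currently running the fewest
    services; ties go to the node listed first in healthy_nodes.
    """
    if not healthy_nodes:
        raise ValueError("no healthy node available for failover")

    # Copy every surviving node so the input placement is left unchanged.
    new_placement = {
        node: list(services)
        for node, services in placement.items()
        if node != failed_node
    }
    for node in healthy_nodes:
        new_placement.setdefault(node, [])

    for service in placement.get(failed_node, []):
        target = min(healthy_nodes, key=lambda n: len(new_placement[n]))
        new_placement[target].append(service)
    return new_placement
```

Spreading services by current load, rather than sending them all to one standby machine, is one simple way to avoid overloading a single surviving node.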

Customizing data center servers is also very worthwhile. Cloud computing requires large-scale deployments, so servers need to be more densely deployed, energy-efficient, and easy to manage, but are not as demanding in terms of computing power per node.

The ordinary servers produced by vendors are more concerned with performance and scalability than with cost and energy consumption because they have to adapt to a wide range of applications. If the server is specifically customized for the cloud, it will be optimized for the characteristics of the cloud, and thus more in line with the user's needs.

For enterprises, the benefits are obvious. Even if the power saved by each customized server is limited (for example, 2 power supplies instead of 4), the cost savings add up over the long run in a massively deployed data center.

For example, Google designs its servers in-house, with customized trays and built-in batteries for backup power. Compared with traditional servers, they cost less and consume much less power, which saves Google a great deal on power expenses.

Data center storage architecture is more diverse and complex. As cloud computing, virtualization, big data, and related technologies have entered the data center, storage has changed enormously. Block storage, file storage, and object storage support access to a variety of data types, and centralized storage is no longer the mainstream storage architecture of the data center.

Access to massive data requires a distributed storage architecture with very strong scalability. In large-scale systems, technologies such as distributed file systems and distributed object storage give storage applications high scalability, great elasticity, and strong data access performance. Because these distributed technologies support standardized hardware, large-scale data center storage can be built, operated, and maintained at low cost.

Of course, distributed storage is not meant to replace existing disk arrays. It is a new form of storage system that emerged to cope with rapid growth in data volume and bandwidth. Another trend is software-defined storage, which separates software from hardware in the storage architecture, that is, separates the data layer from the control layer.

For data center users, software handles the management and scheduling of storage resources, which are virtualized, abstracted, and automated. This can fully meet the requirements for deploying, managing, monitoring, and adjusting the data center storage system, making it flexible and highly available.

Enterprise and Internet data is growing at a rate of 50% per year. Structured data makes up only a limited share of the new data; most of it is unstructured or semi-structured. The data center storage architecture therefore needs to be extremely elastic and adapt as the business develops. Low cost, massive scalability, and high-concurrency performance are the basic technical attributes of a storage architecture for operating large-scale cloud data centers.
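To see what 50% annual growth means for capacity planning, the compound growth can be worked out directly: after n years the data volume is the starting volume multiplied by 1.5 to the power n. The sketch below assumes a constant growth rate; the starting size and installed capacity in the example are made-up figures for illustration.

```python
def projected_size(current_tb, years, annual_growth=0.5):
    """Data volume after the given number of years of compound growth."""
    return current_tb * (1 + annual_growth) ** years


def years_until_exceeds(current_tb, capacity_tb, annual_growth=0.5):
    """Whole years until the data volume first exceeds the installed capacity."""
    if current_tb <= 0 or annual_growth <= 0:
        raise ValueError("current_tb and annual_growth must be positive")
    years = 0
    size = float(current_tb)
    while size <= capacity_tb:
        size *= 1 + annual_growth
        years += 1
    return years
```

Starting from a hypothetical 100 TB, `projected_size(100, 2)` gives 225 TB, and `years_until_exceeds(100, 500)` gives 4, since the volume passes 500 TB in the fourth year. This is why a storage architecture that cannot be expanded cheaply is outgrown quickly.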

Storing and deeply processing large volumes of disorganized data, and quickly extracting valuable information to support business decisions, will become the basis for the survival of all types of enterprises. It is also the direction in which storage, and the businesses that grow around storage architecture, will continue to develop.

Finally, consider data center security. Security consists of a number of smaller items: attack protection, upgrades and backups, finding and fixing bugs, scripting tools, data security, service inspections, and so on. Each of these, taken on its own, covers a lot of ground.

Attack protection, for example, mainly means preventing malicious and unintentional attacks on the data center from outside. In a malicious attack, someone deliberately uses various attack methods to get inside the data center and steal or destroy important data for their own ends.

There are also unintentional attacks. Because the data center must stay interconnected with the outside world and its operation changes constantly, some abnormal traffic will inevitably hit it. Such traffic sometimes even originates inside the data center, for example when servers are infected by viruses or when hardware failures create loops, abnormal traffic, and other network faults.

All of these affect the operation of the data center, so attack protection is a big topic. It cannot be solved by deploying a few security devices; it requires comprehensive, unified planning for the entire data center and targeted deployment of security measures.

As hacking techniques advance, security protection must also be strengthened continuously. This is a process of ongoing learning and improvement that does not stop as long as the data center is running. To make operation and maintenance easier, executable scripts should also be prepared in advance so that problems can be dealt with quickly in an emergency.

For example, if a data center's business becomes abnormal, routing must be adjusted on the core router to direct all traffic to other data centers so that the business recovers quickly. With a ready-made script that runs automatically, the switch can be made rapidly. The data center should also prepare scripts for many other tasks so that they can be used quickly in an emergency.
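The decision part of such a traffic-switching script can be sketched as follows. This is only an illustration under stated assumptions: traffic distribution is modeled as a share per site, the site names are hypothetical, and the step that actually applies the result to the core router is omitted because it depends entirely on the vendor and tooling in use.

```python
def drain_site(weights, failed_site):
    """Move all traffic away from failed_site.

    weights: dict mapping site name -> share of traffic it currently receives.
    The failed site's share is redistributed to the remaining sites in
    proportion to their current shares, so the result again sums to 1.
    """
    if failed_site not in weights:
        raise KeyError(f"unknown site: {failed_site}")

    # Only sites that already carry traffic are treated as valid targets.
    remaining = {
        site: share
        for site, share in weights.items()
        if site != failed_site and share > 0
    }
    total = sum(remaining.values())
    if total == 0:
        raise RuntimeError("no other site is available to take the traffic")

    new_weights = {site: 0.0 for site in weights}
    for site, share in remaining.items():
        new_weights[site] = share / total
    return new_weights
```

Keeping the calculation separate from the device commands means the script can be rehearsed and checked ahead of time, which matters when it will only ever be run under pressure.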

After the analysis above, you may be surprised at how much data center operation and maintenance involves: dozens of items large and small, none of them simple, and each requiring a good deal of technical knowledge. A data center is usually the information processing center of a company, enterprise, or government department.

Almost all business is carried out through the data center, so it is crucial to an enterprise or government department. Whether a data center runs stably and efficiently depends above all on operation and maintenance. Only when every aspect of this work is done properly can the data center remain stable over the long term.