What is website architecture
Website architecture is generally understood as the design work that, based on an analysis of customer requirements, identifies the site's target audience, sets the overall structure of the site, plans and designs its sections and their content, and lays out the development process and its sequence, so that resources are allocated and managed as efficiently as possible. It takes three forms: program architecture, presentation architecture, and information architecture. The work is mainly divided into two steps: hard architecture and soft architecture. Website architecture is a necessary foundation for learning and developing modern web technology.

Chinese name: Web architecture
Generally considered: Based on the results of customer needs analysis
Development: Website development process and sequence
Content: Program architecture, presentation architecture


Hard Architecture

Choice of server room

When choosing a server room, consider the geographical distribution of the website's users. You can choose a Netcom or a Telecom server room, but more often than not a dual-line server room is the appropriate choice. The larger the city, the more expensive the server room, so to save cost you can host servers in a small or medium-sized city. A Beijing company, for example, can consider hosting its servers in Tianjin or Langfang, which are not particularly far away but are much cheaper.

The size of the bandwidth

Often, when the boss pays us to architect a site, he gives us some targets, for example that the site should withstand 1 million page views (PV) per day. We then have to estimate how much bandwidth is needed. The calculation mainly involves two figures, peak traffic and page size, so let us make two assumptions before calculating:

First: assume that peak traffic is five times the average traffic.

Second: assume that the average page size per visit is about 100 KB.

If there are 1 million page views in a day and they are evenly distributed, that works out to about 12 visits per second. If the average page size per visit is about 100 KB, those 12 visits amount to about 1200 KB per second. Page sizes are measured in bytes while bandwidth is measured in bits, and 1 byte = 8 bits, so 1200 KB per second is roughly 9600 Kbit per second, that is, somewhere between 9 and 10 Mbps. In practice the website must remain accessible at peak traffic, so under the peak-traffic assumption above the real bandwidth requirement is around 45 to 50 Mbps.

Of course, this conclusion is based on the two assumptions mentioned earlier, and if your actual situation is different from these two assumptions, then the results will be different.
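The estimate above can be reproduced with a few lines of Python. This is only a sketch under the two stated assumptions; the function name and its parameters are illustrative, and 1 Mbit is taken as 1000 Kbit.

```python
def required_bandwidth_mbps(daily_pv, page_kb=100, peak_factor=5):
    """Estimate the peak bandwidth in Mbps needed for a given daily page-view count.

    page_kb and peak_factor are the two assumptions from the text:
    about 100 KB per page and a peak of five times the average.
    """
    seconds_per_day = 24 * 60 * 60
    avg_visits_per_sec = daily_pv / seconds_per_day
    avg_kbytes_per_sec = avg_visits_per_sec * page_kb
    avg_kbits_per_sec = avg_kbytes_per_sec * 8      # 1 byte = 8 bits
    avg_mbps = avg_kbits_per_sec / 1000             # assumes 1 Mbit = 1000 Kbit
    return avg_mbps * peak_factor


if __name__ == "__main__":
    print(round(required_bandwidth_mbps(1_000_000), 1))  # 46.3
```

Without rounding the visit rate up to 12 per second, the result is about 46 Mbps, in line with the rough figure above. Changing either assumption changes the result proportionally.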

Division of servers

Let's start by looking at which servers we need: image servers, page servers, database servers, application servers, log servers, and so on.

For a website with a large number of visitors, it is necessary to have separate image servers and page servers. We can use lighttpd to run the image servers and apache to run the page servers, though other choices are possible. We can even expand to many image servers and many page servers and set up corresponding domain names, such as img.domain and www.domain. Image paths in pages then use absolute URLs, such as <img src="http://img.domain/abc.gif" />, and DNS round robin provides the most basic form of load balancing. More servers inevitably bring a synchronization problem, which can be handled with rsync.

The database server is the most important piece, because nine times out of ten the bottleneck of a site is the database. Small and medium-sized sites mostly use MySQL, but its clustering capability does not seem to have reached a stable stage yet, so it is not discussed here. In general, when using MySQL we should set up a master-slave structure (one master, multiple slaves), with the master database server using InnoDB tables and the slave database servers using MyISAM tables, to play to the strengths of each. Such a structure separates read and write operations and reduces the pressure of reads, and we can even dedicate one slave server to backups.

Otherwise, if you only have a master server, mysqldump is basically out of the question when the amount of data is large, and if you copy the data files directly you have to stop the database service before copying, or the backup files will be corrupt. For many websites, stopping the database service even for a second is unacceptable. With a slave database server, you can stop replication first (slave stop), take the backup, and then start replication again (slave start); the slave will automatically catch up with the master, and nothing is affected. However, the master-slave architecture also has a fatal drawback: it only reduces the pressure of read operations and cannot reduce the pressure of write operations.
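The read/write separation described above can be sketched as a small router that sends writes to the master and spreads reads across the slaves. This is an illustration only: the class name, the server labels, and the rule of classifying a statement by its first keyword are assumptions, and a real application would hold database connections rather than strings.

```python
import itertools


class ReadWriteRouter:
    """Pick a server for each SQL statement: reads go to slaves, writes to the master."""

    READ_KEYWORDS = ("SELECT", "SHOW")

    def __init__(self, master, slaves):
        self.master = master
        # Rotate through the slaves so reads are spread evenly.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        words = sql.split()
        if not words:
            raise ValueError("empty SQL statement")
        is_read = words[0].upper() in self.READ_KEYWORDS
        if is_read and self._slaves is not None:
            return next(self._slaves)
        # Writes, and reads when no slave is configured, go to the master.
        return self.master


if __name__ == "__main__":
    router = ReadWriteRouter("master", ["slave1", "slave2"])
    print(router.route("SELECT * FROM users"))           # slave1
    print(router.route("UPDATE users SET name = 'x'"))   # master
```

Every write still lands on the single master, which is exactly the drawback noted above.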

To accommodate larger scale, there may be only one last trick left: splitting the database vertically or horizontally. Vertical partitioning means storing different tables on different database servers, for example the user table on database server A and the articles table on database server B. Such a split has a price, the most basic being that you can no longer perform LEFT JOIN and similar operations across those tables. Horizontal partitioning generally means dividing the rows among servers by something like the user identifier (user_id). For example, with five database servers, rows for which "user_id % 5 + 1" equals 1 are saved to the first server, rows for which it equals 2 to the second server, and so on. There are many possible rules for horizontal partitioning, to be chosen according to the situation. Like vertical partitioning, it also has a price, the most basic being that summary operations such as COUNT and SUM become troublesome. In summary, the database server solution is usually a mixture of these approaches, chosen to combine their advantages, and it sometimes needs third-party software such as memcached to cope with heavier traffic.
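The "user_id % 5 + 1" rule, and the reason COUNT becomes awkward once rows are split by user_id, can be illustrated with a short sketch. The function names are hypothetical and the "servers" are plain in-memory lists standing in for real databases.

```python
from collections import defaultdict

NUM_SERVERS = 5


def server_for_user(user_id, num_servers=NUM_SERVERS):
    """Return the 1-based server number for a user, using user_id % N + 1."""
    return user_id % num_servers + 1


def distribute(user_ids, num_servers=NUM_SERVERS):
    """Group user ids by the server that would store them."""
    servers = defaultdict(list)
    for user_id in user_ids:
        servers[server_for_user(user_id, num_servers)].append(user_id)
    return dict(servers)


def count_all(servers):
    """A global COUNT has to ask every server and add up the partial results."""
    return sum(len(rows) for rows in servers.values())


if __name__ == "__main__":
    servers = distribute(range(1, 13))
    print(servers[1])          # [5, 10]
    print(count_all(servers))  # 12
```

A query that used to be a single COUNT now has to visit every server and merge the partial results in application code, and the same applies to SUM and other aggregates.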

It would be ideal to have a dedicated application server to run PHP scripts, so that the page servers hold only static pages. The application server can be given a domain name such as app.domain to distinguish it from the page servers. For the application server, I still prefer apache in prefork mode, together with a PHP cache such as xcache. Load as few modules as possible: apart from mod_rewrite and other necessary modules, discard everything unnecessary to minimize the memory consumption of the httpd processes. Image servers, page servers and other servers of static content can use lighttpd or tux, so that each kind of server plays to its strengths.

If conditions allow, an independent log server is also necessary. Small sites usually combine the page server and the log server, and use cron to run the previous day's log calculations in the early morning when there are few visitors. However, with log analysis software such as awstats and millions of visits, the calculation consumes a lot of time and server resources even when the logs are archived by day. It is therefore still worthwhile to set up a separate log server, so that the analysis does not affect the production servers.
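As a rough idea of the kind of calculation such a nightly job performs, the sketch below counts requests per hour from access-log lines. It assumes the logs use the Common Log Format timestamp (for example [10/Oct/2000:13:55:36 -0700]); the function name is illustrative, and this is not how awstats itself works.

```python
import re
from collections import Counter

# Timestamp as written in Common Log Format, e.g. [10/Oct/2000:13:55:36 -0700]
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2} [+-]\d{4}\]")


def page_views_per_hour(lines):
    """Count log lines per (day, hour); lines without a timestamp are skipped."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            day, hour = match.groups()
            counts[(day, hour)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
        '127.0.0.1 - - [10/Oct/2000:13:58:01 -0700] "GET /abc.gif HTTP/1.0" 200 512',
        '127.0.0.1 - - [10/Oct/2000:14:02:10 -0700] "GET /index.html HTTP/1.0" 200 2326',
    ]
    for (day, hour), views in sorted(page_views_per_hour(sample).items()):
        print(day, hour, views)
```

Even a simple pass like this reads every line of the log, which is why running it on a machine that is also serving pages becomes costly at millions of visits per day.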

Soft Architecture

Framework selection

There are many PHP frameworks to choose from, such as CakePHP, Symfony and Zend Framework. There is no single answer as to which one to use; it depends on how well the team members understand each framework. Often you can write a good program even without a framework. Flickr, for example, is said to be written with the PEAR and Smarty class libraries. So whether to use a framework, and which one, is generally not the most important question. What matters is that we program with a framework mindset.

Layering of logic