When I entered the internet industry I had zero foundation: not just zero foundation in data analysis, but zero foundation in ability of any kind.
How zero? It took me three to four months to find a job, and I ended up starting in operations.
I have never been strong at math. At university I took advanced mathematics, statistics, SQL and C, but I scraped through each of them with low marks, and I leaned on my friends to get through the exams. Looking back now, I should have studied harder at that time.
In the beginning I could not use VLOOKUP. Nobody taught me, and I could only do basic operations in Excel. When I had to match up several reports, I relied on fast hands, searching, copying and pasting one row at a time... with a lot of data I would have been in tears. Eventually I realized this was no way to work, so I turned to the almighty Baidu:
"How to match data across multiple tables in Excel."
That was the first time I saw the VLOOKUP function. I did not learn it in one go, either; every time I used it I had to read through the online examples first. When I later taught it to my team, they picked it up much faster than I had.
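For readers who think more easily in code, matching two tables the way VLOOKUP does comes down to a keyed lookup. The following is only a minimal Python sketch: the tables, the column names and the `vlookup` helper are all made up for illustration, and it mimics only the exact-match mode of the Excel function.

```python
# Hypothetical data: two "sheets" that share a user_id column.
users = [
    {"user_id": 1, "name": "Ann"},
    {"user_id": 2, "name": "Bo"},
    {"user_id": 3, "name": "Cy"},
]
orders = [
    {"user_id": 1, "amount": 30},
    {"user_id": 3, "amount": 12},
]


def vlookup(key, table, key_col, value_col, default=None):
    """Return value_col from the first row of table whose key_col equals key."""
    for row in table:
        if row[key_col] == key:
            return row[value_col]
    return default  # no match, similar in spirit to Excel's #N/A


# Attach each user's order amount, leaving None where nothing matches.
merged = [
    {**user, "amount": vlookup(user["user_id"], orders, "user_id", "amount")}
    for user in users
]
```

Scanning the table for every key is slow on large data; building a dictionary keyed on `user_id` once would be the usual next step.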
I learned Excel one step at a time, relying on searching and figuring things out, and taking time to practice analysis on real work content: for example, what kind of users are willing to use our app, and which metrics are especially good for which users.
Even then, I had not heard of pivot tables.
I remember that in early 2015 my boss gave me a task: collect data online, tens of thousands of articles. I could not possibly copy and paste all of that, so I went searching again:
"How to quickly download data from the web."
That is how I heard of crawlers and Python, though I could use neither. In the end I relied on a third-party crawler tool and followed tutorials. I first learned HTML and CSS to understand page structure, then GET and POST requests, then regular expressions. It took a week of overtime to download everything.
But that was not the end: the data was dirty and still needed cleaning. I spent another week learning Excel's FIND, RIGHT, MID, REPLACE, TRIM and other text functions. I did not know it was called data cleaning at the time, but I picked up a lot of tricks, and even though I worked as fast as I could, it still took days.
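The same text functions have close counterparts in most programming languages. As a rough sketch, here is how they might look in Python on a made-up string; the sample text and variable names are assumptions for illustration, not data from the original task.

```python
raw = "  Title:  Data   Analysis 101  "

# TRIM: strip both ends and collapse runs of spaces into one.
trimmed = " ".join(raw.split())

# FIND: position of a substring (0-based in Python, 1-based in Excel).
colon = trimmed.find(":")

# MID: a slice taken from the middle of the string.
middle = trimmed[colon + 2 : colon + 6]

# RIGHT: the last n characters.
last3 = trimmed[-3:]

# Text replacement. Note that str.replace swaps matching text, which is
# closer to Excel's SUBSTITUTE than to its position-based REPLACE.
swapped = trimmed.replace("Title", "Course")
```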
When I write a Python crawler now, I am far more efficient. That includes text cleaning, which is very fast with Levenshtein. All together it takes one evening.
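Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another, which makes it handy for spotting near-duplicate text. The sketch below is a plain-Python version of the standard dynamic-programming algorithm, not necessarily the library or approach used at the time; the `near_duplicates` helper and its threshold are illustrative assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def near_duplicates(titles, max_distance=2):
    """Pairs of titles within max_distance edits (hypothetical helper)."""
    pairs = []
    for i, first in enumerate(titles):
        for second in titles[i + 1:]:
            if levenshtein(first, second) <= max_distance:
                pairs.append((first, second))
    return pairs
```

Comparing every pair is quadratic in the number of titles, so on tens of thousands of articles a compiled implementation or some pre-grouping would likely be needed.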
No learning is wasted; a lot of knowledge is connected. I learned HTML and CSS because of the crawler, and along the way I picked up site structure and web analytics.
After deploying Baidu Statistics I got to know JavaScript and learned the metrics on the web side: access paths, funnel conversion, bounce rate, exit rate, and so on. This knowledge is useful beyond websites; it applies to app analysis and user behavior too.
We tend to treat learning as a series of isolated points: finish this book, then read the next. That does not make learning efficient. All knowledge is related, what you know about A can be applied to B, and a skill tree should spread out like a net.
The chain above is the map of relationships I used to acquire new knowledge from what I already knew. Data analysis covers a wide range of areas: besides a solid business background, it calls for a Swiss-army-knife skill tree, that is, a T-shaped (versatile) skill set.
For example, you see that a certain page has a high bounce rate. In addition to the regular analysis, you have to check the network speed, whether users are on weak connections, whether the HTML page loads too much, whether caching is used, how DNS resolution performs, and so on. Nobody will teach you this, but it shapes business results.
Do not be afraid at this point. There is more to learn, but as you go deeper you find that much of it is interconnected. Conversion rate comes from web analytics, but it can be used for product paths, extended to Sankey diagrams, and applied to user segmentation. The more you learn, the more easily one insight unlocks the next.
Drive
In fact, when learning data analysis from zero, the hardest threshold is not skill but motivation. I have trained data analysts from scratch: Excel, SQL, analytical thinking and Python, all from scratch. The difficulty never lay in the knowledge, but in whether you really want to learn.
Downloading a dozen gigabytes of material is not learning, and following a pile of WeChat official accounts is not learning. The material never gets opened and the accounts go unread. Does that show a desire to learn? From zero it is all too easy to start and hard to persist, so the effort stays shallow.
Not knowing where to start means not knowing what to learn. As I said, data analysis is a broad discipline. It has the methodology of traditional business analysis as well as the statistics and programming of the data age. But it also happens to be a skill that is useful in any position in any career; there is no getting around it.
Learning is a very subjective matter. From elementary school to university we spend well over a decade as students, and the ability we lack most is active learning. Through the years of grinding for the high school and college entrance exams, in most cases it is the environment that forces people to study; they have no drive or habit of learning of their own. Then four years of university go by, and whatever habit there was may well be worn away.
We are used to passive learning because we solve whatever problem is set in front of us, knowing how to apply the formula without needing the principle. Textbooks and exercise books never go beyond the syllabus. The whole educational environment is built for passivity.
Now you study data analysis: you pick up books, open PDFs, follow official accounts. No teacher will correct or coach you, and no homework will push you to practice. You do not know what will be used most at work, there are no practice datasets, and it is hard even to judge the quality of what you find online.
Nothing to lean on, right? But that is active learning.
The mindset has to change.
When learning data analysis from zero, your greatest teacher can only be yourself; no article will turn anyone into a data analyst overnight. I have mentored interns who were willing to learn and grew quickly, and I have taught colleagues who were interested but could not keep up the pace. The former is active learning; the latter is passive learning that stops at interest.
Because you have no foundation, being proactive matters even more. Data analysis is a fast-growing field: a few years ago knowing SQL was enough, now you need to understand some MapReduce and Hive, and in a few years Spark SQL may be a must if you want to do well in this line of work. Continuous learning is a must-have ability. Your foundation may not be as good as others', but at least do not lose to them on learning.
My advice is that learning should be aimed at solving a specific problem, with a goal set for it. To put it bluntly, hands-on practice is king. Whatever your profession, you can get your hands on some data. Do not analyze it right away; first think about what could be done with it and form a simple hypothesis.
I'm in HR, and my hypothesis is that recruiting is getting harder and harder these days.
I'm in marketing, and my hypothesis is that marketing costs too much and is not effective.
I'm in operations or product, and my hypothesis is that a certain metric will not improve because of reasons A, B and C.
Even as a student, I can form a hypothesis about whether it is easy or hard to make money in the shopping area around campus.
Data is collected, generated, combined, used, argued over and analyzed around hypotheses. It is a McKinsey-style way of thinking, and it works as a way to learn data too. Newcomers easily get lost in a data maze: I have no data; if I have data, I do not know what to do; if I know what to do, I do not know how. Thinking in circles is far less useful than having a direction.
The benefit of starting from a hypothesis is that I have a direction from the outset. Never mind whether it is right; at least I can analyze along that direction.
Say HR believes recruiting is getting harder. Then pull out the historical data: how many resumes did I used to download, how many phone calls did I make, and how many offers did I send for each person who finally came on board? And now? Looking at the numbers for each step is exactly a conversion rate. Widen the time dimension: was recruiting also hard at this time last year? Is it always hard at year end? That is how you come to understand the line chart.
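The recruiting steps described here form a funnel, and the conversion rate of each step is simply the count at that step divided by the count at the step before. A small Python sketch, where the stage names and all counts are hypothetical numbers chosen only to show the arithmetic:

```python
def step_conversion(funnel):
    """Conversion rate between each pair of consecutive funnel stages.

    funnel is an ordered list of (stage_name, count) pairs.
    """
    rates = []
    for (stage, count), (next_stage, next_count) in zip(funnel, funnel[1:]):
        rate = next_count / count if count else 0.0
        rates.append((stage, next_stage, rate))
    return rates


# Hypothetical recruiting funnel for one quarter.
funnel = [("resumes", 200), ("calls", 80), ("offers", 10), ("hires", 6)]

steps = step_conversion(funnel)
overall = funnel[-1][1] / funnel[0][1]  # hires per resume downloaded
```

Running the same calculation per month or per quarter and plotting the rates over time gives the line chart comparison mentioned above.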
A marketing specialist can draw on even more data. If the hypothesis is that marketing costs are too high, how high are they now, and when did they rise? Find that point in time and analyze it. If results are poor, when did they turn poor, and had the market environment changed by then? Assuming the market environment changed is itself a new hypothesis, which leads on to further in-depth research.
While the efficiency and results of each person's analysis will differ, the thinking can be trained this way. It is not that having data makes analysis possible; it is having a direction for analysis that lets you collect and analyze the data. My learning has always been driven by solving problems, never by a sudden flash of insight.
If you think of learning data analysis as a very long road, we do not drive straight to the end in one go; no one can. Rather, we divide the road into segments, plant a flag on each as a goal, and steer by the nearest flag instead of a finish line dozens of kilometers away.
Curiosity
Besides the drive to learn, you need a curious mind to become a data analyst.
Curiosity means asking questions, thinking about problems, mulling them over, and solving them. If you are naturally nosy, bring that to data analysis and you are a born analyst.
Many people chase the tools, knowledge points and tricks of data analysis. But curiosity is seldom mentioned.
Curiosity is a core problem-solving ability. Programming can be practiced and statistics can be learned; in the end these are not the bottleneck. When you have mastered every skill in the book and face a tough fight, what do you ultimately need? Curiosity. It is what wins with data.
Knowledge determines the lower limit of problem solving, curiosity determines the upper limit of problem solving. A good data analyst will be curious, ask questions, think about problems, and be able to solve them.
The earliest campaigns we pushed had no monitoring system, and the whole operation lacked guidance from data. For me at the time, much of operations ran in a black box. I did not know what had been sent or what had happened in between; there was only a final result.
If others asked me why, I could only offer hypotheses: perhaps reason one, two or three. Whether that was really the case, I did not know.
The number of active users is up. Why? No idea.
What were the results after the SMS push? No idea.
Where did the new signups come from? No idea.
As the company's business lines expanded and the number of users grew, matching data in Excel became more and more of a strain. When I once again went to R&D with a data request, the CTO said to me: why don't I give you database access, and you can query it yourself.
So I said goodbye to Excel and learned my way around the database. The data I worked with grew from a few tables to hundreds.
I learned the difference between left join and inner join.
I learned group by, data structures, and indexes.
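The difference between the two joins is easiest to see on a tiny example. The sketch below uses Python's built-in `sqlite3` module with two made-up tables (`users` and `orders` are illustrative names, not the company's schema): an inner join keeps only users who have orders, while a left join keeps every user and fills the missing side with NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
    INSERT INTO users  VALUES (1, 'Ann'), (2, 'Bo'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 30.0), (11, 1, 5.0), (12, 3, 12.0);
""")

# INNER JOIN: users without any order (Bo) drop out of the result.
inner = conn.execute("""
    SELECT u.name, COUNT(o.order_id), SUM(o.amount)
    FROM users u
    INNER JOIN orders o ON o.user_id = u.user_id
    GROUP BY u.user_id
    ORDER BY u.user_id
""").fetchall()

# LEFT JOIN: every user is kept; Bo gets a count of 0 and a NULL sum.
left = conn.execute("""
    SELECT u.name, COUNT(o.order_id), SUM(o.amount)
    FROM users u
    LEFT JOIN orders o ON o.user_id = u.user_id
    GROUP BY u.user_id
    ORDER BY u.user_id
""").fetchall()

conn.close()
```

Which join is right depends on the question: "how much did buyers spend" suits the inner join, while "how many users never bought" needs the left join.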
During that period I needed to build a user data system, including retention, activity, returning users, tiering and other metrics. I looked up how operational metrics were applied and explained online while working out the SQL to implement them.
Because I understood the database, when I explained and discussed things with R&D many needs were met through more reasonable requests. This was the first time I encountered, understood and built a business-centered data system.
An example: a user has used the app for a long time, so we call him a loyal user. Suddenly he stops using it for several weeks in a row. We find such users through SQL, analyze their behavior, interview them by phone about why they stopped, and try to win them back. The same goes for all other operations.
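The logic for finding lapsed loyal users can be written down in a few lines. The original work was done in SQL; the sketch below expresses the same idea in Python, and the thresholds (30 days of history to count as loyal, 21 days of silence to count as lapsed) as well as the sample data are assumptions for illustration only.

```python
from datetime import date


def lapsed_loyal_users(sessions, today, loyal_days=30, silent_days=21):
    """Users with a long usage history who have recently gone quiet.

    sessions maps user_id to a list of dates on which the user was active.
    """
    lapsed = []
    for user_id, days in sessions.items():
        if not days:
            continue
        first, last = min(days), max(days)
        is_loyal = (last - first).days >= loyal_days    # used the app over a long span
        is_silent = (today - last).days >= silent_days  # no activity for weeks
        if is_loyal and is_silent:
            lapsed.append(user_id)
    return sorted(lapsed)


# Hypothetical activity log.
sessions = {
    "u1": [date(2015, 1, 5), date(2015, 3, 1), date(2015, 5, 20)],  # loyal, now silent
    "u2": [date(2015, 2, 1), date(2015, 6, 28)],                    # loyal, still active
    "u3": [date(2015, 5, 1), date(2015, 5, 10)],                    # silent, never loyal
}
to_call = lapsed_loyal_users(sessions, today=date(2015, 6, 30))
```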
Only then could I say that I understood the active user count, why it went up and why it went down.
We push SMS to different users, and with SQL I can query how well it did, but is there a clearer metric? For example, how many users opened the app because of the SMS, and what was the SMS open rate?
At the time we used short links with a URL scheme that jumps straight into the app, and we embedded tracking parameters in the short links. From the push data we could see how many people opened each SMS.
This is a measure of copywriting: good copy must move the user to open. We often ran A/B tests on copy. As an example, our SMS marketing was tied to gifts. At the time many users registered online but never downloaded the app, so we sent those users a text message like this:
丨We have prepared an exclusive gift for you, XXXXX, please open the app to claim it.
This SMS had an open rate of about 10%. There was still room to optimize, so I kept revising the text, and eventually changed it to:
丨Since you have already registered, why not come and claim your exclusive gift, XXXXX, please open the app to claim it (the middle content remains unchanged).
The open rate rose to 18%, because the copy uses marketing psychology: "already registered" plays on sunk cost. I have done this much, so why not continue, or the registration was for nothing. This psychology is common at tourist attractions: the attraction may be a rip-off, but the vast majority of people still say, well, since we are already here. It is the same psychology at work.
Follow-up messages took a personalized approach and eventually reached 25%, about two and a half times the open rate of the earliest message. If you are not curious about how an SMS performs, and do not collect data to monitor the metrics, there is nothing to optimize against. You can write good copy by feel, but you will not know its exact effect; data will.
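To make the comparison concrete, the open rate is the number of opens divided by the number of messages delivered, and each version can be compared with the first one as a multiple. In the sketch below only the 10%, 18% and 25% rates come from the text; the message counts and version names are hypothetical, picked to reproduce those rates.

```python
def open_rate(opened, delivered):
    """Share of delivered messages whose short link was opened."""
    return opened / delivered if delivered else 0.0


# Hypothetical counts chosen to reproduce the 10%, 18% and 25% open rates.
versions = {
    "original": (100, 1000),
    "already_registered": (180, 1000),
    "personalized": (250, 1000),
}

rates = {name: open_rate(opened, sent) for name, (opened, sent) in versions.items()}
baseline = rates["original"]

# How many times the baseline open rate each version achieved.
multiples = {name: rate / baseline for name, rate in rates.items()}
```

With these numbers the second version comes out at roughly 1.8 times the baseline and the personalized version at roughly 2.5 times.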
Another example: in the beginning we used WeChat Moments to bring in new users. There were several channels, but I did not know which one was effective. Then my curiosity got the better of me: which channel works well? Can the invitation conversion rate be optimized? What does each channel cost?
Again I pushed data analysis through to implementation. WeChat's web page sharing automatically appends parameters such as from=timeline, and through those parameters I could filter out browsing and visit data from the WeChat side. Later I asked R&D to set parameters for different channels, so that I could compute conversion rates by parameter and tag each new user with a channel source.
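Reading a channel tag out of a URL and computing a conversion rate per channel takes very little code. The sketch below uses Python's standard `urllib.parse`; the `from=timeline` parameter is the one mentioned above, while the `channel` parameter name, the example.com URLs and the counts are hypothetical stand-ins for whatever R&D actually configured.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse


def channel_of(url, default="unknown"):
    """Return the channel tag carried in a URL's query string."""
    query = parse_qs(urlparse(url).query)
    # A custom "channel" parameter (hypothetical name) wins over WeChat's "from".
    for key in ("channel", "from"):
        if key in query:
            return query[key][0]
    return default


def conversion_by_channel(visit_urls, signup_urls):
    """Signups divided by visits for each channel seen in the visit log."""
    visits = Counter(channel_of(url) for url in visit_urls)
    signups = Counter(channel_of(url) for url in signup_urls)
    return {channel: signups[channel] / count for channel, count in visits.items()}


# Hypothetical logs of landing-page visits and completed registrations.
visit_urls = [
    "https://example.com/invite?from=timeline",
    "https://example.com/invite?from=timeline",
    "https://example.com/invite?channel=gift_picker&from=timeline",
    "https://example.com/invite",
]
signup_urls = ["https://example.com/invite?from=timeline"]

rates = conversion_by_channel(visit_urls, signup_urls)
```

The same `channel_of` value can be stored on the user record at registration, which is what makes later per-channel targeting possible.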
Along the way I found that one channel's conversion rate was too low. We had roughly two types of channel. In one, the landing page invites the user to register directly, with gift information attached. In the other, the user first picks a gift style and is only sent to registration at the claim step. Conversion analysis showed that the latter lost far more users: there were too many steps, including a delivery address to fill in, and the appeal of picking a gift was not enough to carry the user through the whole process.
So we changed the process for the second type of channel. And because users from different registration channels carried tags, later operations on new users could be targeted. This is one of the reasons personalized SMS reached a 25% open rate.
Curiosity serves to solve problems. By constantly thinking about problems and solving them, the ability related to data analysis will naturally improve.
Fortunately, curiosity can be trained later in life by asking more questions and thinking more about problems, and the training is not very hard.
Non-data
The other problem with learning from zero is that it tends to underrate the importance of business.
In reality, the difficulty in trying to become a data analyst is not a lack of knowledge of Excel, SQL, statistics, etc. Rather, it's the lack of business knowledge.
Of two people, one who understands the business but not data and one who understands data but not the business, the former is more likely to solve real problems, because data analysts always serve the business.
I once made a request to the product team (without treating them to dinner) to set up event tracking on the app and the web, in order to understand users through their paths and to make up for the shortcomings of Baidu Statistics.
At that time the data was stored in Hadoop, and Hive was used for offline scripts that cleaned, partitioned and processed it. The product pages users browse, the features they use and the time they stay can form the basis of user profiling.
I used to wonder what a user profile was, because the web says that a user's gender, location, age, marital status, finances, interests and preferences form the basis of a user profile. But we could not get that much data in our business. I think a user profile exists to serve the business, so it should not have one strict, uniform standard. As long as it works well in the business, it is a good user profile.
The user profile of an online video site, for instance, collects the actors, release date, origin, language and genre of each movie, and also breaks down whether the user fast-forwards or drags the progress bar. All of this is business-oriented. Even the analysts at video sites have to watch a lot of movies in order to analyze them with the business in mind. Otherwise, with so many movie categories and genres, how would they break down the various metrics? And whether dragging and fast-forwarding indicates interest is something you can only judge if you have behaved that way yourself.
How do you learn industry and business knowledge from zero? If you already work close to the business and just want to add data analysis, the difficulty is considerably smaller. If, like me in the beginning, you have neither business knowledge nor data, it is still possible.
If data is learned through hypothesis-driven thinking, business should be learned through systematic thinking. Business knowledge also needs a purpose and direction, but not in the same way as data analysis. Business learning emphasizes being systematic, which does not mean big and all-encompassing but top-down and structured. First pick one direction and drill deep; breadth will expand gradually as you dig.
For example, if you are a layman who wants to learn how to analyze user operations, do not start by asking what user operations is; that question is too big. Instead, aim in one direction, such as activity: understand its definition and meaning, then think about how to apply it. How would you define activity for an offline shopping mall, for hospital patients, for a school club? Think about activity using examples from around you. In a shopping mall, activity could be the flow of people walking around, the flow of customers who actually buy, or the big spenders carrying armfuls of bags. What factors affect activity? Promotions or discounts, holidays or location. Once these questions are worked out, getting up to speed with user operations will be quick.
Then apply the same thinking to retention and to acquisition (pulling in new users). You will see that when the mall's visitors keep coming back to spend money, that is retention, and when new visitors arrive, that is acquisition. Which factors in all this affect one another? In the end your knowledge should form a pyramid. The top layer is user operations; the middle is acquisition, activity and retention; the bottom layer is the individual points and factors.
Learning data analysis emphasizes deduction and reasoning, while learning business emphasizes association and application. Curiosity and hypotheses are used here too; both help accelerate learning.
For those who want to become a data analyst from scratch, some of this may still feel foggy. These soft skills will not get anyone there in a single step; in fact, with Seven Weeks to Become a Data Analyst, I said from the very beginning that it is an introductory syllabus. What matters is whether you really want to learn and learn well. The master leads you through the door, but practice is up to you; everything else is empty talk.
I remember a chicken-soup saying I read long ago: when you want to move forward, the whole world makes way for you. I think that is more powerful than anything I have said.
So you ask me can you become a data analyst with zero foundation? My answer is yes you can.
This article was written in something of a rush, so I'll end by wishing you all a Merry Christmas.