At my previous company I did some scraping work and helped the scraper engineers solve a few problems. I later published some articles about it online, after which people approached me for outsourced scraping jobs, roughly to scrape user data and product data from Xiaohongshu, but I turned them down. In my view, few of China's big-data companies actually hold large volumes of data of their own; instead, teams of scraper engineers keep collecting data from everywhere. So don't assume your data is worthless: for a content company, data is its core competitive advantage. What I want to discuss next is the security of the network and of data.

For a content company, data security matters a great deal; the importance of its data goes without saying. Say you run an online education platform: the question bank is important, right? What if someone scrapes all of it? If your core competitive advantage is taken away, you're finished. Or say an independent developer wants to copy your product: by capturing traffic and scraping they take your core data, quickly build a website and an App, and become a serious rival in short order.
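Background
After analyzing the web pages inside our App, we found that our data security is fairly poor. The problems are as follows:
The site's data is served through the earliest style of front-end/back-end separation. Any engineer with a little web front-end knowledge can analyze the site with Chrome and then scrape whatever data they need. Open the "Network" panel and you see every network request the site makes. Oops, what did I just see? That's right, all of the site's API information, for example "detail.json?itemId=141529859". Or perhaps your API applies some special checks and stores information in sessionStorage, cookies, or localStorage; a scraper engineer with some front-end experience thinks "heh, isn't this data running around naked?". Or some parameters are generated on the fly by JavaScript functions. That is no real obstacle: the engineer can inspect the page elements, find the key id or CSS class name, look it up in the "Search" panel, and open the corresponding JS code. If the site uses the early front-end development model, the code is completely exposed, identical to what the developer sees in their own IDE, and an experienced scraper can work directly from it. The security problem therefore needs to be solved urgently.
Even though the App's data is sent over HTTPS, a professional packet-capture tool can still read it directly, so the App's security can also be improved. The specific strategies are covered below.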
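Scraping techniques
Solution
Devising a web-side anti-scraping scheme
Starting from two angles (what you see on the page is not what you get; inspecting API requests is useless), I drew up the following anti-scraping scheme.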
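- # Suppose the data that must be displayed correctly is "19950220"
- 1. First, following rules of your own choosing (a scrambled digit mapping: normally 0 maps to 0, but a scrambled mapping might be 0 <-> 1, 1 <-> 9, 3 <-> 8, ...), build a custom font (ttf).
- 2. Apply that scrambled mapping to obtain the data to return: 19950220 -> 17730220 (in this example 9 -> 7 and 5 -> 3, while 0, 1 and 2 map to themselves).
- 3. Iterate over each character of the string from step 2 and apply a linear transform (y = kx + b) to it. The coefficient and the constant are computed from the current date. For example, if the current date is "2018-07-24", then k is 7 and b is 24.
- 4. Join the transformed values with "3.1415926" and return the result to the API caller. (Why 3.1415926? Since we are disguising numbers, a numeric separator is less likely to catch a researcher's attention, but a separator that is too short could collide with real data, so the familiar π is used.)
- 1773 -> "1*7+24" + "3.1415926" + "7*7+24" + "3.1415926" + "7*7+24" + "3.1415926" + "3*7+24" -> 313.1415926733.1415926733.141592645
- 02 -> "0*7+24" + "3.1415926" + "2*7+24" -> 243.141592638
- 20 -> "2*7+24" + "3.1415926" + "0*7+24" -> 383.141592624
- # The front end decrypts the data it receives, then renders the page with the custom font
- 1. Split the received string on "3.1415926" into an array.
- 2. For each element of the array, invert the linear transform (y = kx + b, with k and b again derived from the current date) to recover the original value.
- 3. Concatenate the values from step 2 and render them on the page with the ttf file.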
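The steps above can be sketched as a small round trip. This is only an illustration under stated assumptions: the function names are hypothetical, k and b are passed in explicitly instead of being read from the clock so the result is reproducible, and the substitution table is the one that matches the 19950220 -> 17730220 example.

```javascript
// Joiner from the article.
const JOINER = "3.1415926";

// Digit substitution consistent with the example 19950220 -> 17730220
// (assumed to be the table the custom font was built for).
const SUBSTITUTION = { 0: 0, 1: 1, 2: 2, 3: 4, 4: 5, 5: 3, 6: 8, 7: 6, 8: 9, 9: 7 };

// Back-end step 2: replace every digit with its scrambled counterpart.
function substitute(digits) {
  return [...digits].map((d) => SUBSTITUTION[d]).join("");
}

// Back-end steps 3 and 4: apply y = k*x + b to each digit and join the results.
function encodeDigits(digits, k, b) {
  return [...digits].map((d) => Number(d) * k + b).join(JOINER);
}

// Front-end steps 1 and 2: split on the joiner and invert the linear transform.
function decodeDigits(payload, k, b) {
  return payload.split(JOINER).map((y) => (Number(y) - b) / k).join("");
}

// Worked example for 2018-07-24 (k = 7, b = 24).
const wire = encodeDigits(substitute("1995"), 7, 24);
console.log(wire);                      // 313.1415926733.1415926733.141592645
console.log(decodeDigits(wire, 7, 24)); // 1773
```

The decoded text is still the scrambled "1773"; only the custom font turns it back into "1995" on screen, which is why reading the DOM does not reveal the real value.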
The following uses Node.js as an example to show what the back end needs to do
- // json
- var JoinOparatorSymbol = "3.1415926";
- function encode(rawData, ruleType) {
- if (!isNotEmptyStr(rawData)) {
- return "";
- }
- var date = new Date();
- var year = date.getFullYear();
- var month = date.getMonth() + 1;
- var day = date.getDate();
- var encodeData = "";
- for (var index = 0; index < rawData.length; index++) {
- var datacomponent = rawData[index];
- if (!isNaN(datacomponent)) {
- if (ruleType < 3) {
- var currentNumber = rawDataMap(String(datacomponent), ruleType);
- encodeData += (currentNumber * month + day) + JoinOparatorSymbol;
- }
- else if (ruleType == 4) {
- encodeData += rawDataMap(String(datacomponent), ruleType);
- }
- else {
- encodeData += rawDataMap(String(datacomponent), ruleType) + JoinOparatorSymbol;
- }
- }
- else if (ruleType == 4) {
- encodeData += rawDataMap(String(datacomponent), ruleType);
- }
- }
- // strip the trailing joiner, if any
- if (encodeData.length >= JoinOparatorSymbol.length) {
- var lastTwoString = encodeData.substring(encodeData.length - JoinOparatorSymbol.length, encodeData.length);
- if (lastTwoString == JoinOparatorSymbol) {
- encodeData = encodeData.substring(0, encodeData.length - JoinOparatorSymbol.length);
- }
- }
- return encodeData;
- }
- // font mapping
- function rawDataMap(rawData, ruleType) {
- if (!isNotEmptyStr(rawData) || !isNotEmptyStr(ruleType)) {
- return;
- }
- var mapData;
- var rawNumber = parseInt(rawData);
- var ruleTypeNumber = parseInt(ruleType);
- if (!isNaN(rawData)) {
- lastNumberCategory = ruleTypeNumber;
- // data encryption rule for font file 1
- if (ruleTypeNumber == 1) {
- if (rawNumber == 1) {
- mapData = 1;
- }
- else if (rawNumber == 2) {
- mapData = 2;
- }
- else if (rawNumber == 3) {
- mapData = 4;
- }
- else if (rawNumber == 4) {
- mapData = 5;
- }
- else if (rawNumber == 5) {
- mapData = 3;
- }
- else if (rawNumber == 6) {
- mapData = 8;
- }
- else if (rawNumber == 7) {
- mapData = 6;
- }
- else if (rawNumber == 8) {
- mapData = 9;
- }
- else if (rawNumber == 9) {
- mapData = 7;
- }
- else if (rawNumber == 0) {
- mapData = 0;
- }
- }
- // data encryption rule for font file 2
- else if (ruleTypeNumber == 0) {
- if (rawNumber == 1) {
- mapData = 4;
- }
- else if (rawNumber == 2) {
- mapData = 2;
- }
- else if (rawNumber == 3) {
- mapData = 3;
- }
- else if (rawNumber == 4) {
- mapData = 1;
- }
- else if (rawNumber == 5) {
- mapData = 8;
- }
- else if (rawNumber == 6) {
- mapData = 5;
- }
- else if (rawNumber == 7) {
- mapData = 6;
- }
- else if (rawNumber == 8) {
- mapData = 7;
- }
- else if (rawNumber == 9) {
- mapData = 9;
- }
- else if (rawNumber == 0) {
- mapData = 0;
- }
- }
- // data encryption rule for font file 3
- else if (ruleTypeNumber == 2) {
- if (rawNumber == 1) {
- mapData = 6;
- }
- else if (rawNumber == 2) {
- mapData = 2;
- }
- else if (rawNumber == 3) {
- mapData = 1;
- }
- else if (rawNumber == 4) {
- mapData = 3;
- }
- else if (rawNumber == 5) {
- mapData = 4;
- }
- else if (rawNumber == 6) {
- mapData = 8;
- }
- else if (rawNumber == 7) {
- mapData = 3;
- }
- else if (rawNumber == 8) {
- mapData = 7;
- }
- else if (rawNumber == 9) {
- mapData = 9;
- }
- else if (rawNumber == 0) {
- mapData = 0;
- }
- }
- else if (ruleTypeNumber == 3) {
- if (rawNumber == 1) {
- mapData = "";
- }
- else if (rawNumber == 2) {
- mapData = "";
- }
- else if (rawNumber == 3) {
- mapData = "";
- }
- else if (rawNumber == 4) {
- mapData = "";
- }
- else if (rawNumber == 5) {
- mapData = "";
- }
- else if (rawNumber == 6) {
- mapData = "";
- }
- else if (rawNumber == 7) {
- mapData = "";
- }
- else if (rawNumber == 8) {
- mapData = "";
- }
- else if (rawNumber == 9) {
- mapData = "";
- }
- else if (rawNumber == 0) {
- mapData = "";
- }
- }
- else{
- mapData = rawNumber;
- }
- } else if (ruleTypeNumber == 4) {
- var sources = ["年", "万", "业", "人", "信", "元", "千", "司", "州", "资", "造", "钱"];
- // check whether the string consists of Chinese characters
- if (/^[\u4e00-\u9fa5]*$/.test(rawData)) {
- if (sources.indexOf(rawData) > -1) {
- var currentChineseHexcod = rawData.charCodeAt(0).toString(16);
- var lastCompoent;
- var mapComponetnt;
- var numbers = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"];
- var characters = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"];
- if (currentChineseHexcod.length == 4) {
- lastCompoent = currentChineseHexcod.substr(3, 1);
- var locationInComponents = 0;
- if (/[0-9]/.test(lastCompoent)) {
- locationInComponents = numbers.indexOf(lastCompoent);
- mapComponetnt = numbers[(locationInComponents + 1) % 10];
- }
- else if (/[a-z]/.test(lastCompoent)) {
- locationInComponents = characters.indexOf(lastCompoent);
- mapComponetnt = characters[(locationInComponents + 1) % 26];
- }
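- // build an HTML numeric character reference; the "&#x" prefix is assumed, since the source shows an empty string here
- mapData = "&#x" + currentChineseHexcod.substr(0, 3) + mapComponetnt + ";";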
- }
- } else {
- mapData = rawData;
- }
- }
- else if (/[0-9]/.test(rawData)) {
- mapData = rawDataMap(rawData, 2);
- }
- else {
- mapData = rawData;
- }
- }
- return mapData;
- }
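The long if/else chains in rawDataMap amount to three lookup tables. As a sketch only (the names RULES, mapDigit and collisions are hypothetical, and rule type 3 is left out because its values are empty in the listing above), a table-driven form makes the mapping easier to read and also makes it possible to check that each table is reversible, which the custom font depends on.

```javascript
// Digit tables copied from the rule-type branches of rawDataMap.
// The array index is the real digit, the value is the digit that gets sent.
const RULES = {
  1: [0, 1, 2, 4, 5, 3, 8, 6, 9, 7],
  0: [0, 4, 2, 3, 1, 8, 5, 6, 7, 9],
  2: [0, 6, 2, 1, 3, 4, 8, 3, 7, 9],
};

// Look up one digit; rule types without a table here return the digit unchanged.
function mapDigit(digit, ruleType) {
  const table = RULES[ruleType];
  return table ? table[digit] : digit;
}

// List every sent digit that more than one real digit maps to.
function collisions(table) {
  const seen = new Map();
  table.forEach((sent, real) => seen.set(sent, [...(seen.get(sent) || []), real]));
  return [...seen].filter(([, reals]) => reals.length > 1);
}

for (const [ruleType, table] of Object.entries(RULES)) {
  console.log(ruleType, JSON.stringify(collisions(table)));
}
```

Run against the tables as listed, the check reports that rule type 2 sends both 4 and 7 as 3, so a font alone could not tell them apart. Which of the two entries is the typo cannot be determined from the listing, so the table is left as it appears in the source.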
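- // api
- const fs = require("fs"); // needed by the readFileSync calls below
- // pathname, products, rule and LBPEncode are not defined in this excerpt
- module.exports = {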
- "GET /api/products": async (ctx, next) => {
- ctx.response.type = "application/json";
- ctx.response.body = {
- products: products
- };
- },
- "GET /api/solution1": async (ctx, next) => {
- try {
- var data = fs.readFileSync(pathname, "utf-8");
- ruleJson = JSON.parse(data);
- rule = ruleJson.data.rule;
- } catch (error) {
- console.log("fail: " + error);
- }
- var data = {
- code: 200,
- message: "success",
- data: {
- name: "@杭城小刘",
- year: LBPEncode("1995", rule),
- month: LBPEncode("02", rule),
- day: LBPEncode("20", rule),
- analysis : rule
- }
- }
- ctx.set("Access-Control-Allow-Origin", "*");
- ctx.response.type = "application/json";
- ctx.response.body = data;
- },
- "GET /api/solution2": async (ctx, next) => {
- try {
- var data = fs.readFileSync(pathname, "utf-8");
- ruleJson = JSON.parse(data);
- rule = ruleJson.data.rule;
- } catch (error) {
- console.log("fail: " + error);
- }
- var data = {
- code: 200,
- message: "success",
- data: {
- name: LBPEncode("建造师",rule),
- birthday: LBPEncode("1995年02月20日",rule),
- company: LBPEncode("中天公司",rule),
- address: LBPEncode("浙江省杭州市拱墅区石祥路",rule),
- bidprice: LBPEncode("2万元",rule),
- negative: LBPEncode("2018年办事效率太高、负面基本没有",rule),
- title: LBPEncode("建造师",rule),
- honor: LBPEncode("最佳奖",rule),
- analysis : rule
- }
- }
- ctx.set("Access-Control-Allow-Origin", "*");
- ctx.response.type = "application/json";
- ctx.response.body = data;
- },
- "POST /api/products": async (ctx, next) => {
- var p = {
- name: ctx.request.body.name,
- price: ctx.request.body.price
- };
- products.push(p);
- ctx.response.type = "application/json";
- ctx.response.body = p;
- }
- };
- // router
- const fs = require("fs");
- function addMapping(router, mapping){
- for(var url in mapping){
- if (url.startsWith("GET ")) {
- var path = url.substring(4);
- router.get(path,mapping[url]);
- console.log(`Register URL mapping: GET: ${path}`);
- }else if (url.startsWith('POST ')) {
- var path = url.substring(5);
- router.post(path, mapping[url]);
- console.log(`Register URL mapping: POST ${path}`);
- } else if (url.startsWith('PUT ')) {
- var path = url.substring(4);
- router.put(path, mapping[url]);
- console.log(`Register URL mapping: PUT ${path}`);
- } else if (url.startsWith('DELETE ')) {
- var path = url.substring(7);
- router.del(path, mapping[url]);
- console.log(`Register URL mapping: DELETE ${path}`);
- } else {
- console.log(`Invalid URL: ${url}`);
- }
- }
- }
- function addControllers(router, dir){
- fs.readdirSync(__dirname + "/" + dir).filter( (f) => {
- return f.endsWith(".js");
- }).forEach( (f) => {
- console.log(`Process controllers:${f}...`);
- let mapping = require(__dirname + "/" + dir + "/" + f);
- addMapping(router,mapping);
- });
- }
- module.exports = function(dir){
- let controllers = dir || "controller";
- let router = require("koa-router")();
- addControllers(router,controllers);
- return router.routes();
- };
- $("#year").html(getRawData(data.year,log));
- // util.js
- var JoinOparatorSymbol = "3.1415926";
- function isNotEmptyStr($str) {
- if (String($str) == "" || $str == undefined || $str == null || $str == "null") {
- return false;
- }
- return true;
- }
- function getRawData($json,analisys) {
- $json = $json.toString();
- if (!isNotEmptyStr($json)) {
- return;
- }
- var date= new Date();
- var year = date.getFullYear();
- var month = date.getMonth() + 1;
- var day = date.getDate();
- var datacomponents = $json.split(JoinOparatorSymbol);
- var orginalMessage = "";
- for(var index = 0;index < datacomponents.length;index++){
- var datacomponent = datacomponents[index];
- if (!isNaN(datacomponent) && analisys < 3){
- var currentNumber = parseInt(datacomponent);
- orginalMessage += (currentNumber - day)/month;
- }
- else if(analisys == 3){
- orginalMessage += datacomponent;
- }
- else{
- // other cases to be continued; this demo will keep being updated as I research and try out more anti-scraping techniques
- }
- }
- return orginalMessage;
- }
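For example, suppose the back end returns 323.1415926743.1415926743.141592646 on a day when k = 7 and b = 25. Splitting on "3.1415926" gives 32, 74, 74 and 46, and inverting y = 7x + 25 according to the agreed algorithm yields 1773.
That 1773 is then drawn with the ttf file, so what the page displays is 1995.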
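To make the two sentences above concrete, here is a hedged sketch of what ends up in the DOM versus what the reader sees. It assumes the rule-1 table from rawDataMap and treats the font as the inverse of that table; domText, visibleText and GLYPH are illustrative names, not part of the demo.

```javascript
const JOINER = "3.1415926";

// Rule-1 table from rawDataMap: real digit -> digit sent in the payload.
const WIRE_DIGIT = { 0: "0", 1: "1", 2: "2", 3: "4", 4: "5", 5: "3", 6: "8", 7: "6", 8: "9", 9: "7" };

// The custom font has to draw each payload digit as the real digit,
// so it behaves like the inverse table.
const GLYPH = Object.fromEntries(
  Object.entries(WIRE_DIGIT).map(([real, wire]) => [wire, real])
);

// Invert y = k*x + b. Returns null when the payload does not decode to
// single digits, which happens if client and server disagree on the date.
function domText(payload, k, b) {
  const values = payload.split(JOINER).map((y) => (Number(y) - b) / k);
  const valid = values.every((v) => Number.isInteger(v) && v >= 0 && v <= 9);
  return valid ? values.join("") : null;
}

// What a reader sees once the font is applied to the DOM text.
function visibleText(text) {
  return [...text].map((ch) => GLYPH[ch] ?? ch).join("");
}

const payload = ["32", "74", "74", "46"].join(JOINER); // 1773 encoded with k = 7, b = 25
console.log(domText(payload, 7, 25)); // 1773
console.log(visibleText("1773"));     // 1995
console.log(domText(payload, 7, 24)); // null
```

The last call shows a practical consequence of deriving k and b from the date: if the two sides compute different dates, for instance around midnight or across time zones, decoding fails, so a real deployment would presumably need both sides to agree on the date they use.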
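JS obfuscation tools
Personally I don't think this approach is very secure on its own, so I thought of combining several measures. For example
Upgraded anti-scraping
For a scraper developer with solid front-end experience, the scheme above could probably still be cracked, so I built an upgraded version on top of it
With this combination of measures in place, an ordinary scraper will simply give up.
Upgrading the anti-scraping measures again
The methods above mainly protect digits. What if Chinese characters need protecting as well? Here are a few schemes
I implemented scheme 1 in the Demo.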
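The rule-type-4 branch of rawDataMap shown earlier appears to work by shifting the last hex digit of a protected character's code point by one. The sketch below illustrates that idea under several assumptions: the output is taken to be an HTML numeric character reference of the form "&#x....;" (the prefix does not survive in the listing), the character list is shortened to a few entries, and code points ending in 9 or f are passed through unchanged instead of reproducing the source's wrap-around.

```javascript
// A few of the characters protected by the demo (shortened list).
const PROTECTED = ["年", "人", "元", "千"];

// Shift the last hex digit of the code point by one and wrap the result
// as an HTML numeric character reference (the "&#x" prefix is an assumption).
function shiftedEntity(ch) {
  const hex = ch.charCodeAt(0).toString(16);
  const last = hex.slice(-1);
  // Only handle last digits whose successor is still a single hex digit.
  if (hex.length !== 4 || last === "9" || last === "f") return ch;
  const next = (parseInt(last, 16) + 1).toString(16);
  return "&#x" + hex.slice(0, 3) + next + ";";
}

// Replace protected characters and leave everything else untouched.
function encodeHanzi(text) {
  return [...text].map((ch) => (PROTECTED.includes(ch) ? shiftedEntity(ch) : ch)).join("");
}

console.log(encodeHanzi("1995年")); // 1995&#x5e75;
```

For this to display correctly, the custom character font would presumably have to draw the shifted code point (U+5E75 here) with the glyph of the original character (年, U+5E74), in the same way the number font redraws scrambled digits.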
Key steps
- //style.css
- @font-face {
- font-family: "NumberFont";
- src: url('http://127.0.0.1:8080/Util/analysis');
- -webkit-font-smoothing: antialiased;
- -moz-osx-font-smoothing: grayscale;
- }
- @font-face {
- font-family: "CharacterFont";
- src: url('http://127.0.0.1:8080/Util/map');
- -webkit-font-smoothing: antialiased;
- -moz-osx-font-smoothing: grayscale;
- }
- h2 {
- font-family: "NumberFont";
- }
- h3,a{
- font-family: "CharacterFont";
- }
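Links
Font creation steps, ttf to svg conversion, font mapping rules
Resulting behavior
The data shown on the page differs from what Inspect Element shows
The API response, Inspect Element, and the rendered page all show different values
The result changes again every time the page is refreshed
Digits and Chinese characters are handled by different mechanisms
With this combination of measures in place, an ordinary scraper will simply give up.
The ttf to svg site mentioned earlier restricts conversion when the ttf file is too large and asks you to pay, so here is a new link.
ttf to svg
Demo address
Steps to run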
- // Client: look up your machine's IP first, then change the API address in Demo/Spider-develop/Solution/Solution1.js and Demo/Spider-develop/Solution/Solution2.js to that IP
- $ cd Demo
- $ ls
- REST Spider-release file-Server.js
- Spider-develop Util rule.json
- $ node file-Server.js
- Server is runnig at http://127.0.0.1:8080/
- // Server: install the dependencies first
- $ cd REST/
- $ npm install
- $ node app.js
Security solutions on the App side
For more about Hybrid, see the article Awesome Hybrid
- var requestObject = {
- url: arg.Api + "SearchInfo/getLawsInfo",
- params: requestparams,
- Hybrid_Request_Method: 0
Article title: How to Handle Security in the Big Front-End Era