鱼C论坛 (FishC Forum)

[Tech Discussion] [Python decryption] Cracking a character-substitution cipher

Posted on 2017-2-24 11:06:30

A few days ago a colleague and I played a character-substitution decryption game. I thought it was fun, so I'm sharing it here.

My colleague had come across a puzzle in an English-language newspaper: a passage of English text had been encrypted by substituting its letters (case-sensitively, so uppercase and lowercase letters count as different symbols), and readers were invited to decrypt it.

Ciphertext:
pz ZcG pkHMcG, eY zgm 1950G, H rQcIq cx QmGmHQWgmQG kmu iB AmzQcqckeG, eYWkIueYr hcgY JcY vmIMHYY HYu tzHYeGkHl PkHM, umJmkcqmu zgm AcYzm yHQkc Mmzgcu. KmYmQHkkB GqmHXeYr, zgm AcYzm yHQkc Mmzgcu eG H GzHzeGzeWHk HqqQcHWg zc GckJm umzmQMeYeGzeW MHYB-icuB qQcikmMG. VY 1953 AmzQcqckeG Wc-HIzgcQmu zgm xeQGz qHqmQ cY H zmWgYeaIm zgHz lHG WmYzQHk zc zgm Mmzgcu Ycl XYclY HG GeMIkHzmu HYYmHkeYr. LgeG kHYuMHQX qHqmQ Ggclmu zgm xeQGz YIMmQeWHk GeMIkHzecYG cx H keaIeu. Lgm HkrcQezgM xcQ rmYmQHzeYr GHMqkmG xQcM zgm CckznMHYY ueGzQeiIzecY lHG kHzmQ rmYmQHkenmu iB j.O. EHGzeYrG zc imWcMm zgm AmzQcqckeG-EHGzeYrG HkrcQezgM. Em eG WQmuezmu HG qHQz cx zgm zmHM zgHz WHMm Iq lezg zgm YHMm AcYzm yHQkc Mmzgcu eY QmxmQmYWm zc H WckkmHrIm'G QmkHzeJm'G kcJm xcQ zgm WHGeYcG cx AcYzm yHQkc. AcYzm yHQkc MmzgcuG HQm H WkHGG cx WcMqIzHzecYHk HkrcQezgMG zgHz QmkB cY QmqmHzmu QHYucM GHMqkeYr zc WcMqIzm zgmeQ QmGIkzG.


If you're interested, have a go at it yourself first.

Below I'll share my approach and the steps I took.
Latest courses from 小甲鱼 -> https://ilovefishc.com

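OP | Posted on 2017-2-24 11:24:30
When my colleague and I first saw the puzzle, our idea was brute force: try every letter against every other letter. On reflection that is hopeless. Counting upper and lower case there are 52 letters, so there are 52! (about 8.07 × 10^67) possible substitution tables, an astronomical number that cannot be searched exhaustively.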
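
To put a number on that, here is a minimal sketch (not part of the original post). The guessing speed `guesses_per_second` is an assumption picked purely for illustration.

```python
import math

# Number of possible substitution tables over 52 symbols
# (26 lowercase + 26 uppercase letters)
keys = math.factorial(52)
print(f"52! has {len(str(keys))} digits, about {keys:.3e}")

# Hypothetical attacker speed: an assumption, not a measured figure
guesses_per_second = 10**9
seconds_per_year = 365 * 24 * 3600

# Years needed to try every table at that speed
years = keys // (guesses_per_second * seconds_per_year)
print(f"years to try every table: about {years:.3e}")
```

Even at a billion guesses per second the search would take vastly longer than any practical time span, which is why the rest of the thread relies on statistics instead.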

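OP | Posted on 2017-2-24 11:30:14
So brute force was ruled out. Then it occurred to us that, since the text is English, we should exploit the statistical properties of English text, namely analyze it by letter frequency.

According to the Wikipedia article on letter frequency (https://zh.wikipedia.org/wiki/字母频率):
Herbert S. Zim, in his classic introductory cryptography book Codes and Secret Writing, gives the English letter frequency order as ETAON RISHD LFCMU GYPWB VKJXQ Z, the most common letter pairs as TH HE AN RE ER IN ON AT ND ST ES EN OF TE ED OR TI HI AS TO, and the most common doubled letters as LL EE SS OO TT FF RR NN PP CC.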
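
The pair statistics quoted above can be collected from the ciphertext as well. The following is only a sketch on a short fragment of the ciphertext; the helper name `pair_counts` is made up for this example and does not appear in the thread.

```python
import re
from collections import Counter

def pair_counts(text):
    """Count adjacent letter pairs inside words, case-sensitively."""
    pairs = Counter()
    for word in re.findall(r"[A-Za-z]+", text):
        pairs.update(a + b for a, b in zip(word, word[1:]))
    return pairs

# Short fragment copied from the ciphertext in the first post
sample = "zgm AcYzm yHQkc Mmzgcu eG H GzHzeGzeWHk HqqQcHWg"
pairs = pair_counts(sample)

# Doubled letters are pairs whose two characters are identical
doubles = {p: n for p, n in pairs.items() if p[0] == p[1]}

print(pairs.most_common(5))
print(doubles)
```

On the full text, a frequent pair such as "zg" is a natural candidate for "th", and a doubled pair such as "qq" for one of the common doubled letters.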


With that lead, the next step is to count how often each character occurs in the ciphertext.

OP | Posted on 2017-2-24 11:38:50
from collections import Counter

pw = "pz ZcG pkHMcG, eY zgm 1950G, H rQcIq cx QmGmHQWgmQG kmu iB AmzQcqckeG, eYWkIueYr hcgY JcY vmIMHYY HYu tzHYeGkHl PkHM, umJmkcqmu zgm AcYzm yHQkc Mmzgcu. KmYmQHkkB GqmHXeYr, zgm AcYzm yHQkc Mmzgcu eG H GzHzeGzeWHk HqqQcHWg zc GckJm umzmQMeYeGzeW MHYB-icuB qQcikmMG. VY 1953 AmzQcqckeG Wc-HIzgcQmu zgm xeQGz qHqmQ cY H zmWgYeaIm zgHz lHG WmYzQHk zc zgm Mmzgcu Ycl XYclY HG GeMIkHzmu HYYmHkeYr. LgeG kHYuMHQX qHqmQ Ggclmu zgm xeQGz YIMmQeWHk GeMIkHzecYG cx H keaIeu. Lgm HkrcQezgM xcQ rmYmQHzeYr GHMqkmG xQcM zgm CckznMHYY ueGzQeiIzecY lHG kHzmQ rmYmQHkenmu iB j.O. EHGzeYrG zc imWcMm zgm AmzQcqckeG-EHGzeYrG HkrcQezgM. Em eG WQmuezmu HG qHQz cx zgm zmHM zgHz WHMm Iq lezg zgm YHMm AcYzm yHQkc Mmzgcu eY QmxmQmYWm zc H WckkmHrIm'G QmkHzeJm'G kcJm xcQ zgm WHGeYcG cx AcYzm yHQkc. AcYzm yHQkc MmzgcuG HQm H WkHGG cx WcMqIzHzecYHk HkrcQezgMG zgHz QmkB cY QmqmHzmu QHYucM GHMqkeYr zc WcMqIzm zgmeQ QmGIkzG."
cp = Counter(pw).most_common()  # (character, count) pairs, most frequent first
print(cp)

This gives:
[(' ', 140), ('m', 82), ('z', 70), ('H', 67), ('c', 65), ('Y', 47), ('G', 46), ('Q', 45), ('e', 44), ('k', 42), ('g', 32), ('M', 29), ('u', 23), ('q', 20), ('W', 18), ('I', 15), ('r', 14), ('x', 11), ('.', 9), ('A', 8), ('l', 7), ('i', 6), ('B', 6), (',', 5), ('J', 5), ('y', 5), ('X', 3), ('-', 3), ('E', 3), ('p', 2), ('1', 2), ('9', 2), ('5', 2), ('a', 2), ('L', 2), ('n', 2), ("'", 2), ('Z', 1), ('0', 1), ('h', 1), ('v', 1), ('t', 1), ('P', 1), ('K', 1), ('V', 1), ('3', 1), ('C', 1), ('j', 1), ('O', 1)]

Ignoring the space, the most frequent characters are, in order, "mzHc". These four are noticeably more frequent than the rest, which fits the Wikipedia figures. "m", the most frequent, almost certainly stands for "e". The next three have similar counts, so they have to be tried against the different orderings of "tao".
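
The "try the orderings of tao" step can be sketched as follows. This is only an illustration: it reuses the leading counts from the table above, fixes "m" to "e" as the post suggests, and lists the six candidate assignments for the next three letters.

```python
from collections import Counter
from itertools import permutations

# Leading entries of the frequency table printed in this post
counts = Counter({' ': 140, 'm': 82, 'z': 70, 'H': 67, 'c': 65, 'Y': 47, 'G': 46})

# Keep letters only, most frequent first
ranked = [ch for ch, _ in counts.most_common() if ch.isalpha()]

# 'm' is fixed to 'e'; the next three cipher letters are paired
# with every ordering of "tao"
candidates = [
    {"m": "e", **dict(zip(ranked[1:4], order))}
    for order in permutations("tao")
]

for cand in candidates:
    print(cand)
```

Each candidate can then be applied to the text and judged by eye, for example by checking whether a plausible "the" appears.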

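OP | Posted on 2017-2-24 11:44:20
The substitution function:
def repl(txt, dic, lst):
    # txt: text to transform
    # dic: substitution dictionary (each pair must be listed in both directions)
    # lst: order in which the swaps are applied
    for rep in lst:
        # Each step only swaps two letters with each other and must not
        # disturb any other letter, so first record the positions of both
        # letters and replace them afterwards. Replacing while scanning
        # would corrupt the text.
        lst1, lst2 = [], []
        for i, ch in enumerate(txt):
            if ch == rep:
                lst1.append(i)
            if ch == dic[rep]:
                lst2.append(i)
        for each in lst1:
            txt = txt[:each] + dic[rep] + txt[each + 1:]
        for each in lst2:
            txt = txt[:each] + dic[dic[rep]] + txt[each + 1:]
    return txt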
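
As an aside, the same two-letter swap can be written with `str.translate`, which replaces both letters in a single pass. This is an alternative sketch, not the code used in the thread; the name `swap_letters` and the list-of-pairs argument are choices made for this example.

```python
def swap_letters(text, pairs):
    """Swap each (a, b) pair of characters, one pair at a time."""
    for a, b in pairs:
        # The translation table maps a -> b and b -> a simultaneously,
        # so no positions need to be recorded beforehand
        text = text.translate(str.maketrans({a: b, b: a}))
    return text

# Short fragment copied from the ciphertext
sample = "zgm AcYzm yHQkc Mmzgcu"
step1 = swap_letters(sample, [("m", "e"), ("z", "t"), ("H", "a"), ("c", "o")])
print(step1)
```

Because strings are immutable in Python, building a new string per swap this way also avoids the repeated slicing done for every matched position.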

OP | Posted on 2017-2-24 11:49:25
Last edited by jerryxjr1220 on 2017-2-24 11:55

First substitution:
dic = {'m':'e','z':'t','H':'a','c':'o',
        'e':'m','t':'z','a':'H','o':'c'}
lst = ['m','z','H','c']
pw = repl(pw,dic,lst)
print(pw)

Result:
pt ZoG pkaMoG, mY tge 1950G, a rQoIq ox QeGeaQWgeQG keu iB AetQoqokmG, mYWkIumYr hogY JoY veIMaYY aYu ztaYmGkal PkaM, ueJekoqeu tge AoYte yaQko Metgou. KeYeQakkB GqeaXmYr, tge AoYte yaQko Metgou mG a GtatmGtmWak aqqQoaWg to GokJe ueteQMmYmGtmW MaYB-iouB qQoikeMG. VY 1953 AetQoqokmG Wo-aItgoQeu tge xmQGt qaqeQ oY a teWgYmHIe tgat laG WeYtQak to tge Metgou Yol XYolY aG GmMIkateu aYYeakmYr. LgmG kaYuMaQX qaqeQ Ggoleu tge xmQGt YIMeQmWak GmMIkatmoYG ox a kmHImu. Lge akroQmtgM xoQ reYeQatmYr GaMqkeG xQoM tge CoktnMaYY umGtQmiItmoY laG kateQ reYeQakmneu iB j.O. EaGtmYrG to ieWoMe tge AetQoqokmG-EaGtmYrG akroQmtgM. Ee mG WQeumteu aG qaQt ox tge teaM tgat WaMe Iq lmtg tge YaMe AoYte yaQko Metgou mY QexeQeYWe to a WokkearIe'G QekatmJe'G koJe xoQ tge WaGmYoG ox AoYte yaQko. AoYte yaQko MetgouG aQe a WkaGG ox WoMqItatmoYak akroQmtgMG tgat QekB oY Qeqeateu QaYuoM GaMqkmYr to WoMqIte tgemQ QeGIktG.

OP | Posted on 2017-2-24 12:00:26
Last edited by jerryxjr1220 on 2017-2-24 13:28

The two numbers in the text, 1950 and 1953, presumably denote years, so "mY tge 1950G" and "VY 1953" are very likely "in the 1950s" and "In 1953" (note the capitalization).
That gives the mappings m → i, Y → n, g → h, V → I. (G → s also follows from "1950G", but it is applied in the next step.)

pt ZoG pkaMoG, in the 1950G, a rQoVq ox QeGeaQWheQG keu mB AetQoqokiG, inWkVuinr gohn Jon veVMann anu ztaniGkal PkaM, ueJekoqeu the Aonte yaQko Methou. KeneQakkB GqeaXinr, the Aonte yaQko Methou iG a GtatiGtiWak aqqQoaWh to GokJe ueteQMiniGtiW ManB-mouB qQomkeMG. In 1953 AetQoqokiG Wo-aVthoQeu the xiQGt qaqeQ on a teWhniHVe that laG WentQak to the Methou nol Xnoln aG GiMVkateu anneakinr. LhiG kanuMaQX qaqeQ Gholeu the xiQGt nVMeQiWak GiMVkationG ox a kiHViu. Lhe akroQithM xoQ reneQatinr GaMqkeG xQoM the CoktYMann uiGtQimVtion laG kateQ reneQakiYeu mB j.O. EaGtinrG to meWoMe the AetQoqokiG-EaGtinrG akroQithM. Ee iG WQeuiteu aG qaQt ox the teaM that WaMe Vq lith the naMe Aonte yaQko Methou in QexeQenWe to a WokkearVe'G QekatiJe'G koJe xoQ the WaGinoG ox Aonte yaQko. Aonte yaQko MethouG aQe a WkaGG ox WoMqVtationak akroQithMG that QekB on Qeqeateu QanuoM GaMqkinr to WoMqVte theiQ QeGVktG.

OP | Posted on 2017-2-24 13:31:25
Next, look at the second-to-last word, theiQ. Its first four letters are already settled, so the last one should be r, which gives Q → r. 1950G should be 1950s, so G → s.
pt Zos pkaMos, in the 1950s, a QroVq ox researWhers keu mB Aetroqokis, inWkVuinQ gohn Jon veVMann anu ztaniskal PkaM, ueJekoqeu the Aonte yarko Methou. KenerakkB sqeaXinQ, the Aonte yarko Methou is a statistiWak aqqroaWh to sokJe ueterMinistiW ManB-mouB qromkeMs. In 1953 Aetroqokis Wo-aVthoreu the xirst qaqer on a teWhniHVe that las Wentrak to the Methou nol Xnoln as siMVkateu anneakinQ. Lhis kanuMarX qaqer sholeu the xirst nVMeriWak siMVkations ox a kiHViu. Lhe akQorithM xor QeneratinQ saMqkes xroM the CoktYMann uistrimVtion las kater QenerakiYeu mB j.O. EastinQs to meWoMe the Aetroqokis-EastinQs akQorithM. Ee is Wreuiteu as qart ox the teaM that WaMe Vq lith the naMe Aonte yarko Methou in rexerenWe to a WokkeaQVe's rekatiJe's koJe xor the Wasinos ox Aonte yarko. Aonte yarko Methous are a Wkass ox WoMqVtationak akQorithMs that rekB on reqeateu ranuoM saMqkinQ to WoMqVte their resVkts.

OP | Posted on 2017-2-24 13:39:18
Lhis should be This, so L → T. "that las" should be "that was", so l → w.
pt Zos pkaMos, in the 1950s, a QroVq ox researWhers keu mB Aetroqokis, inWkVuinQ gohn Jon veVMann anu ztaniskaw PkaM, ueJekoqeu the Aonte yarko Methou. KenerakkB sqeaXinQ, the Aonte yarko Methou is a statistiWak aqqroaWh to sokJe ueterMinistiW ManB-mouB qromkeMs. In 1953 Aetroqokis Wo-aVthoreu the xirst qaqer on a teWhniHVe that was Wentrak to the Methou now Xnown as siMVkateu anneakinQ. This kanuMarX qaqer showeu the xirst nVMeriWak siMVkations ox a kiHViu. The akQorithM xor QeneratinQ saMqkes xroM the CoktYMann uistrimVtion was kater QenerakiYeu mB j.O. EastinQs to meWoMe the Aetroqokis-EastinQs akQorithM. Ee is Wreuiteu as qart ox the teaM that WaMe Vq with the naMe Aonte yarko Methou in rexerenWe to a WokkeaQVe's rekatiJe's koJe xor the Wasinos ox Aonte yarko. Aonte yarko Methous are a Wkass ox WoMqVtationak akQorithMs that rekB on reqeateu ranuoM saMqkinQ to WoMqVte their resVkts.

OP | Posted on 2017-2-24 13:45:06
Xnown should be known, so X → k. showeu should be showed, so u → d.
pt Zos pXaMos, in the 1950s, a QroVq ox researWhers Xed mB AetroqoXis, inWXVdinQ gohn Jon veVMann and ztanisXaw PXaM, deJeXoqed the Aonte yarXo Method. KeneraXXB sqeakinQ, the Aonte yarXo Method is a statistiWaX aqqroaWh to soXJe deterMinistiW ManB-modB qromXeMs. In 1953 AetroqoXis Wo-aVthored the xirst qaqer on a teWhniHVe that was WentraX to the Method now known as siMVXated anneaXinQ. This XandMark qaqer showed the xirst nVMeriWaX siMVXations ox a XiHVid. The aXQorithM xor QeneratinQ saMqXes xroM the CoXtYMann distrimVtion was Xater QeneraXiYed mB j.O. EastinQs to meWoMe the AetroqoXis-EastinQs aXQorithM. Ee is Wredited as qart ox the teaM that WaMe Vq with the naMe Aonte yarXo Method in rexerenWe to a WoXXeaQVe's reXatiJe's XoJe xor the Wasinos ox Aonte yarXo. Aonte yarXo Methods are a WXass ox WoMqVtationaX aXQorithMs that reXB on reqeated randoM saMqXinQ to WoMqVte their resVXts.

OP | Posted on 2017-2-24 13:49:49
The remaining letters are solved the same way, by matching partially decoded words against similar English words: reqeated → repeated gives q → p; resVXts → results; "a QroVq ox researWhers" → "a group of researchers"; and so on.
The final result:
At Los Alamos, in the 1950s, a group of researchers led by Metropolis, including John von Qeumann and Ytanislaw Plam, developed the Monte Carlo method. Generally speaking, the Monte Carlo method is a statistical approach to solve deterministic many-body problems. In 1953 Metropolis co-authored the first paper on a technique that was central to the method now known as simulated annealing. This landmark paper showed the first numerical simulations of a liquid. The algorithm for generating samples from the Boltzmann distribution was later generalized by j.O. Hastings to become the Metropolis-Hastings algorithm. He is credited as part of the team that came up with the name Monte Carlo method in reference to a colleague's relative's love for the casinos of Monte Carlo. Monte Carlo methods are a class of computational algorithms that rely on repeated random sampling to compute their results.
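
The word-matching step described in this post was done by eye. A rough automated version might look like the sketch below, which treats the still-unsolved characters as wildcards. The helper `match_words` and the tiny `WORDS` list are hypothetical; a real run would load a proper dictionary file.

```python
import re

def match_words(partial, unknown, words):
    """Return the words that fit `partial`, treating every character
    listed in `unknown` as a single-letter wildcard."""
    pattern = "".join("." if ch in unknown else re.escape(ch) for ch in partial)
    regex = re.compile(pattern, re.IGNORECASE)
    return [w for w in words if regex.fullmatch(w)]

# Tiny hypothetical word list, for illustration only
WORDS = ["repeated", "results", "group", "researchers", "related", "reheated"]

print(match_words("reqeated", {"q"}, WORDS))
print(match_words("resVXts", {"V", "X"}, WORDS))
```

The first query returns more than one candidate, which shows why context is still needed: the pattern alone cannot choose between "repeated" and "reheated".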

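OP | Posted on 2017-2-24 13:52:09
After these substitutions the gist of the article is readable. The few remaining letters, however, occur too rarely to match against anything, and my English isn't great either, so I had no way to pin them down.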

OP | Posted on 2017-2-24 13:56:38
Later I found the solution in the next issue of the newspaper. The correct answer is:
At Los Alamos, in the 1950s, a group of researchers led by Metropolis, including John von Neumann and Stanislaw Ulam, developed the Monte Carlo method. Generally speaking, the Monte Carlo method is a statistical approach to solve deterministic many-body problems. In 1953 Metropolis co-authored the first paper on a technique that was central to the method now known as simulated annealing. This landmark paper showed the first numerical simulations of a liquid. The algorithm for generating samples from the Boltzmann distribution was later generalized by W.K. Hastings to become the Metropolis-Hastings algorithm. He is credited as part of the team that came up with the name Monte Carlo method in reference to a colleague's relative's love for the casinos of Monte Carlo. Monte Carlo methods are a class of computational algorithms that rely on repeated random sampling to compute their results.

It is an article introducing the Monte Carlo method.
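
With the published answer in hand, the full substitution table can be read off by aligning the ciphertext with it character by character. The sketch below does this for the opening phrase only; `recover_key` is a name invented for this example.

```python
def recover_key(ciphertext, plaintext):
    """Build the cipher-letter -> plain-letter table from an aligned pair."""
    if len(ciphertext) != len(plaintext):
        raise ValueError("texts must have the same length")
    key = {}
    for c, p in zip(ciphertext, plaintext):
        if not c.isalpha():
            continue  # digits, spaces and punctuation are not encrypted
        # A substitution cipher maps each cipher letter to one plain letter
        if key.setdefault(c, p) != p:
            raise ValueError(f"inconsistent mapping for {c!r}")
    return key

# Opening phrase of the ciphertext and of the published answer
cipher = "pz ZcG pkHMcG, eY zgm 1950G"
plain = "At Los Alamos, in the 1950s"
key = recover_key(cipher, plain)
print(key)
```

Run over the whole text, the same loop would also settle the rare letters that frequency analysis and word matching could not resolve.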

Comparing our decrypted text with it:

# pw is our decrypted text; og is the published answer (same length as pw)
c = 0
for i in range(len(pw)):
    if pw[i] != og[i]:
        c += 1
print('Accuracy: {0}%'.format((1 - c / len(pw)) * 100))

Output:
Accuracy: 99.44320712694878%
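
The reported figure is consistent with 5 mismatched characters out of 898. To see which characters differ as well as how many, the comparison could be written as in the sketch below; `compare` is a hypothetical helper, shown here on the one phrase of the two texts that contains three of the mismatches.

```python
def compare(guess, answer):
    """Return (accuracy in percent, list of (index, guessed, correct))."""
    if len(guess) != len(answer):
        raise ValueError("texts must have the same length")
    diffs = [(i, g, a) for i, (g, a) in enumerate(zip(guess, answer)) if g != a]
    return (1 - len(diffs) / len(guess)) * 100, diffs

# The same phrase from our decryption and from the published answer
acc, diffs = compare(
    "John von Qeumann and Ytanislaw Plam",
    "John von Neumann and Stanislaw Ulam",
)
print(acc)
print(diffs)
```

Listing the mismatches makes it obvious that the errors sit in proper names and initials, exactly where word matching has the least to work with.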


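Posted on 2017-2-24 14:52:26
Very nice. Working out on your own an answer this close to the real one is already impressive.
After all, this puzzle was written for native speakers, and some of the vocabulary isn't what we would normally use. At least it was English; in any other language it would have been even nastier.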

Posted on 2017-2-24 19:13:17
666, awesome!
I don't understand any of it,
but I can still tell you're an expert.