Try something new for 30 days

Differences between writing iterators in Python 3 and Python 2

python • Posted by 李魔佛 • 0 comments • 2938 views • 2019-06-26 11:22 • from related topics

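The two are mostly the same. The only difference is that in Python 2 the class must implement a next() method, while in Python 3 it must implement __next__().

Here is an example (Python 3):

def iter_demo():

    class DefineIter(object):

        def __init__(self, length):
            self.length = length
            self.data = range(self.length)
            self.index = 0

        def __iter__(self):
            # An iterator returns itself from __iter__
            return self

        def __next__(self):
            # Signal exhaustion with StopIteration; returning None would not end the loop
            if self.index >= self.length:
                raise StopIteration

            d = self.data[self.index] * 50
            self.index = self.index + 1
            return d

    a = DefineIter(10)
    print(type(a))
    for i in a:
        print(i)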
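If the same class has to run on both versions, one common approach is to define __next__ and alias next to it. The sketch below is only an illustration; the Countdown class is a made-up example and is not part of the original post.

```python
class Countdown(object):
    """Iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # Python 2 looks up next(); alias it so the class works on both versions
    next = __next__


print(list(Countdown(3)))
```

On Python 3 the alias is simply unused; on Python 2 the for loop finds next() and behaves the same way.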

PyCharm: inserting the current time with a shortcut

python • Posted by 李魔佛 • 0 comments • 3773 views • 2019-06-26 09:18 • from related topics

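I find this a very commonly needed feature, but it has to be set up by hand.

Approach

Use a Live Template to insert the time quickly.

Steps

1. Add a Template Group named Common.

2. Add a Live Template with the following settings:

Abbreviation: time
Description: current time
Template Text: $time$

Edit Variables -> Expression: date("yyyy-MM-dd HH:mm:ss")

3. Make the template take effect

Define -> Everywhere

4. Usage

Type time and press Tab, and it expands to the current time.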

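Shenzhen convertible bonds: a conversion order cannot be cancelled once submitted

Stocks • Posted by 李魔佛 • 0 comments • 3068 views • 2019-06-25 09:12 • from related topics

From personal experience: after I submitted a conversion order for a Shenzhen-listed convertible bond, the cancel operation was accepted and the order showed as cancelled, yet the conversion still went through as normal that evening.
So a conversion order cannot actually be cancelled once it has been submitted.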

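Is anyone here using tushare for live trading? Neither buy nor sell can place orders any more. What is going on?

Stocks • aka_12 replied to the question • 2 followers • 2 replies • 6029 views • 2019-06-17 14:29 • from related topics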

Changing easytrader's default launch path for Sinolink Securities (国金证券)

Quantitative trading-Ptrade-QMT • Posted by 李魔佛 • 0 comments • 4585 views • 2019-06-17 10:23 • from related topics

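If your Sinolink Securities (国金证券) client is not installed in the default path, easytrader cannot start it and reports:

pywinauto.application.AppStartError: Could not create the process "C:\全能行证券交易终端\xiadan.exe"
Error returned by CreateProcess: (2, 'CreateProcess', '系统找不到指定的文件。')

I looked at the configuration file and found no parameter for this, so the only option was to modify the source code.
Don't be put off by that: only one line needs to change.

Open this file:
site-packages\easytrader\config\client.py

Find these lines:

class GJ(CommonConfig):
    DEFAULT_EXE_PATH = "C:\\Tool\\xiadan.exe"

Just change the path above to your own install location. Remember to use doubled backslashes.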
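The doubled backslashes matter because a single backslash starts an escape sequence in a normal Python string (here the \x before iadan would be read as the start of a hex escape and the file would fail to parse). A small sketch, using the path from the post, shows that a raw string is an equivalent way to write it:

```python
# The path is the one from the post; substitute your own install location
escaped = "C:\\Tool\\xiadan.exe"   # each backslash doubled
raw = r"C:\Tool\xiadan.exe"        # raw string, backslashes taken literally

print(escaped == raw)
print(escaped)
```

Either spelling produces the same string, so use whichever reads better in client.py.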

conda cannot switch virtual environments from the command line on Windows 10

python • Posted by 李魔佛 • 0 comments • 5123 views • 2019-06-11 10:04 • from related topics

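The virtual environment was already installed.
Running activate py2 in PowerShell did nothing at all. (PowerShell is the enhanced command line shipped with Windows 7 and later.)
I then opened the original cmd prompt (type cmd in the Run dialog) and ran activate py2 again, which solved the problem.
The cause is a compatibility issue between the activate script and PowerShell.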

pymongo find_one_and_update fails: the shard key is required

Database • Posted by 李魔佛 • 0 comments • 5122 views • 2019-06-10 17:13 • from related topics

The error message is as follows:

  File "C:\ProgramData\Anaconda3\lib\site-packages\pymongo\helpers.py", line 155, in _check_command_response
    raise OperationFailure(msg % errmsg, code, response)
pymongo.errors.OperationFailure: Query for sharded findAndModify must contain the shard key
2019-06-10 16:14:32 [scrapy.core.engine] INFO: Closing spider (finished)
2019-06-10 16:14:32 [scrapy.statscollectors] INFO: Dumping Scrapy stats:

The fix is to include the shard key in the query filter as well.
find_one_and_update (findAndModify) modifies only a single document, but which shard holds that document? That cannot be determined from the query alone, which is why the shard key has to be added.
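As a rough sketch of the idea, the helper below checks that a filter names every shard key field before it is passed to find_one_and_update. The helper name targeted_filter, the shard key "site" and the field names are all assumptions made for illustration; they do not come from the original spider.

```python
def targeted_filter(shard_key_fields, doc_filter):
    """Return a copy of doc_filter after checking it names every shard key field."""
    missing = [f for f in shard_key_fields if f not in doc_filter]
    if missing:
        raise ValueError("filter is missing shard key field(s): %s" % ", ".join(missing))
    return dict(doc_filter)


# Assumed for illustration: the collection is sharded on "site"
query = targeted_filter(["site"], {"site": "example.com", "url": "/a"})
print(query)

# With a real pymongo collection the call would then look like:
# collection.find_one_and_update(query, {"$set": {"crawled": True}})
```

Failing early in the client gives a clearer message than the OperationFailure raised by the server.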

From the official documentation:

Targeted Operations vs. Broadcast Operations

Generally, the fastest queries in a sharded environment are those that mongos route to a single shard, using the shard key and the cluster meta data from the config server. These targeted operations use the shard key value to locate the shard or subset of shards that satisfy the query document.

For queries that don’t include the shard key, mongos must query all shards, wait for their responses and then return the result to the application. These “scatter/gather” queries can be long running operations.

Broadcast Operations

mongos instances broadcast queries to all shards for the collection unless the mongos can determine which shard or subset of shards stores this data.

After the mongos receives responses from all shards, it merges the data and returns the result document. The performance of a broadcast operation depends on the overall load of the cluster, as well as variables like network latency, individual shard load, and number of documents returned per shard. Whenever possible, favor operations that result in targeted operation over those that result in a broadcast operation.

Multi-update operations are always broadcast operations.

The updateMany() and deleteMany() methods are broadcast operations, unless the query document specifies the shard key in full.

Targeted Operations

mongos can route queries that include the shard key or the prefix of a compound shard key a specific shard or set of shards. mongos uses the shard key value to locate the chunk whose range includes the shard key value and directs the query at the shard containing that chunk.

For example, if the shard key is:

{ a: 1, b: 1, c: 1 }

The mongos program can route queries that include the full shard key or either of the following shard key prefixes at a specific shard or set of shards:

{ a: 1 }

{ a: 1, b: 1 }

All insertOne() operations target to one shard. Each document in the insertMany() array targets to a single shard, but there is no guarantee all documents in the array insert into a single shard.

All updateOne(), replaceOne() and deleteOne() operations must include the shard key or _id in the query document. MongoDB returns an error if these methods are used without the shard key or _id.

Depending on the distribution of data in the cluster and the selectivity of the query, mongos may still perform a broadcast operation to fulfill these queries.

Index Use

If the query does not include the shard key, the mongos must send the query to all shards as a “scatter/gather” operation. Each shard will, in turn, use either the shard key index or another more efficient index to fulfill the query.

If the query includes multiple sub-expressions that reference the fields indexed by the shard key and the secondary index, the mongos can route the queries to a specific shard and the shard will use the index that will allow it to fulfill most efficiently.

Sharded Cluster Security

Use Internal Authentication to enforce intra-cluster security and prevent unauthorized cluster components from accessing the cluster. You must start each mongod or mongos in the cluster with the appropriate security settings in order to enforce internal authentication.

See Deploy Sharded Cluster with Keyfile Access Control for a tutorial on deploying a secured sharded cluster.

Cluster Users

Sharded clusters support Role-Based Access Control (RBAC) for restricting unauthorized access to cluster data and operations. You must start each mongod in the cluster, including the config servers, with the --auth option in order to enforce RBAC. Alternatively, enforcing Internal Authentication for inter-cluster security also enables user access controls via RBAC.

With RBAC enforced, clients must specify a --username, --password, and --authenticationDatabase when connecting to the mongos in order to access cluster resources.

Each cluster has its own cluster users. These users cannot be used to access individual shards.

See Enable Access Control for a tutorial on enabling adding users to an RBAC-enabled MongoDB deployment.

How to repair a corrupted Jupyter notebook file

python • Posted by 李魔佛 • 0 comments • 4381 views • 2019-06-08 13:44 • from related topics

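Sometimes a git sync ends in a conflicted merge, and git inserts conflict markers such as <<<<<<< HEAD, ======= and >>>>>>> into the Jupyter notebook file. After that the notebook (.ipynb) can no longer be opened. To recover the code:

Use the following script:

# Rescue a corrupted Jupyter notebook file
import re
import codecs

# Capture the contents of every "source" list in the notebook JSON
pattern = re.compile(r'"source": \[(.*?)\]\s+\},', re.S)
filename = 'tushare_usage.ipynb'
with codecs.open(filename, encoding='utf8') as f:
    content = f.read()

source = pattern.findall(content)
for s in source:
    t = s.replace('\\n', '')                 # drop the literal \n escapes
    t = re.sub('"', '', t)                   # drop the JSON string quotes
    t = re.sub(',$', '', t, flags=re.M)      # drop the trailing comma on each line
    print(t)

Just replace the filename with the notebook you want to repair. The script prints the recovered cell sources; it does not rewrite the notebook itself.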
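An alternative to scraping the cell sources is to strip the conflict markers and keep one side, then check that the result parses as JSON again. The sketch below is a different technique from the script above; the function name keep_ours and the sample text are invented for illustration, and it assumes the conflict blocks are well formed.

```python
import json


def keep_ours(text):
    """Drop git conflict markers, keeping the side between <<<<<<< and =======."""
    kept = []
    state = "normal"          # "normal", "ours" or "theirs"
    for line in text.splitlines(keepends=True):
        if line.startswith("<<<<<<<"):
            state = "ours"
        elif line.startswith("=======") and state == "ours":
            state = "theirs"
        elif line.startswith(">>>>>>>") and state == "theirs":
            state = "normal"
        elif state != "theirs":
            kept.append(line)
    return "".join(kept)


conflicted = (
    '{\n'
    '<<<<<<< HEAD\n'
    ' "cells": [1],\n'
    '=======\n'
    ' "cells": [2],\n'
    '>>>>>>> origin/master\n'
    ' "nbformat": 4\n'
    '}\n'
)
repaired = keep_ours(conflicted)
notebook = json.loads(repaired)   # raises ValueError if the result is still not valid JSON
print(notebook)
```

If json.loads succeeds, the repaired text can be written back to the .ipynb file; if it fails, the conflict probably cut through a JSON structure and needs manual editing.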

Warning: unable to run listCollections, attempting to approximate collection

Database • Posted by 李魔佛 • 0 comments • 19017 views • 2019-06-07 17:35 • from related topics

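Listing the collections of a database in the mongo shell produced this warning:

Warning: unable to run listCollections, attempting to approximate collection names by parsing connectionStatus

This happens because a password has been set but the session has not authenticated. Why the message doesn't simply say so is beyond me.

Just run: db.auth('admin', '<password>')
It returns 1 on success; run show tables again and all the collections are listed.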

Connecting to a MongoDB cluster from Python

Database • Posted by 李魔佛 • 0 comments • 4101 views • 2019-06-03 15:55 • from related topics

There is not much material about this online, so I tested it myself.
Connect like this:

import pymongo

db = pymongo.MongoClient('mongodb://10.18.6.46,10.18.6.26,10.18.6.102')

The hosts above all use the default port 27017. For other ports, write them explicitly:

db = pymongo.MongoClient('mongodb://10.18.6.46:8888,10.18.6.26:9999,10.18.6.102:7777')

After that the database can be read and written as usual.

Read:

coll = db['testdb']['testcollection'].find()
for i in coll:
    print(i)

Output:

{'_id': ObjectId('5cf4c7981ee9edff72e5c503'), 'username': 'hello'}
{'_id': ObjectId('5cf4c7991ee9edff72e5c504'), 'username': 'hello'}
{'_id': ObjectId('5cf4c7991ee9edff72e5c505'), 'username': 'hello'}
{'_id': ObjectId('5cf4c79a1ee9edff72e5c506'), 'username': 'hello'}
{'_id': ObjectId('5cf4c7b21ee9edff72e5c507'), 'username': 'hello world'}

Write:

collection = db['testdb']['testcollection']

for i in range(10):
    # insert_one replaces the older insert(), deprecated in PyMongo 3 and removed in PyMongo 4
    collection.insert_one({'username': 'huston{}'.format(i)})
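When the host list lives in configuration, the connection string can be assembled rather than typed by hand. A minimal sketch follows; build_mongo_uri is a hypothetical helper, not part of pymongo, and the addresses are the ones used in the post.

```python
def build_mongo_uri(hosts, default_port=27017):
    """Build a multi-host mongodb:// URI from (host, port) pairs; port may be None."""
    parts = []
    for host, port in hosts:
        if port is None or port == default_port:
            parts.append(host)                        # default port can be omitted
        else:
            parts.append("{}:{}".format(host, port))
    return "mongodb://" + ",".join(parts)


uri = build_mongo_uri([("10.18.6.46", 8888), ("10.18.6.26", None), ("10.18.6.102", 7777)])
print(uri)

# The result would then be passed to the client as in the post:
# client = pymongo.MongoClient(uri)
```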

Original article; please credit the source when reposting:
http://30daydo.com/article/494

Server recommendations, please: this Alibaba host is terrible

Network • 李魔佛 asked a question • 1 follower • 0 replies • 3819 views • 2019-05-26 18:51 • from related topics

Elasticsearch: finding out which shard an existing document is stored on

Database • Posted by 李魔佛 • 0 comments • 3669 views • 2019-05-26 12:54 • from related topics

Suppose I have the following documents:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 5,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "mydoc",
        "_id" : "XxyrM2kBVzdNcvl_GHv2",
        "_score" : 1.0,
        "_source" : {
          "name" : "Shiled",
          "twitter" : "Sonny sql is awesome",
          "date" : "2018-12-27",
          "id" : 1240,
          "tags" : [
            "red",
            "shit"
          ]
        }
      },
      {
        "_index" : "test",
        "_type" : "mydoc",
        "_id" : "YByrM2kBVzdNcvl_tnvm",
        "_score" : 1.0,
        "_source" : {
          "name" : "YYerk",
          "twitter" : "sql is awesome",
          "date" : "2008-12-27",
          "id" : 12357,
          "tags" : [
            "red"
          ]
        }
      },
      {
        "_index" : "test",
        "_type" : "mydoc",
        "_id" : "7777",
        "_score" : 1.0,
        "_source" : {
          "name" : "Rocky Chen",
          "twitter" : "sql is awesome",
          "date" : "2008-12-27",
          "id" : 9999
        }
      },
      {
        "_index" : "test",
        "_type" : "mydoc",
        "_id" : "YhzDN2kBVzdNcvl_enuT",
        "_score" : 1.0,
        "_source" : {
          "name" : "YYerk",
          "twitter" : "sql is awesome",
          "date" : "2008-12-27",
          "id" : 888888,
          "tags" : [
            "red",
            "green"
          ]
        }
      },
      {
        "_index" : "test",
        "_type" : "mydoc",
        "_id" : "YxzDN2kBVzdNcvl_u3th",
        "_score" : 1.0,
        "_source" : {
          "name" : "YYerk",
          "twitter" : "sql is awesome",
          "date" : "2008-12-27",
          "id" : 888888,
          "tags" : [
            "red",
            "green"
          ]
        }
      }
    ]
  }
}

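How can I find out which shard holds the document with "_id" : "YxzDN2kBVzdNcvl_u3th"?

Quote:

Routing a document to a shard
When a document is indexed, it is stored on a single primary shard. How does Elasticsearch know which shard a document belongs to? When we create a document, how does it decide whether to store it on shard 1 or shard 2?
It certainly cannot be random, or we would not know where to look when we later need to retrieve the document. In fact, the shard is determined by this formula:
shard = hash(routing) % number_of_primary_shards
routing is a variable value that defaults to the document's _id but can also be set to a custom value. The routing value is passed through a hash function to produce a number, which is then divided by number_of_primary_shards (the number of primary shards) to get the remainder. That remainder, always between 0 and number_of_primary_shards-1, is the number of the shard where the document lives.
This explains why the number of primary shards must be fixed when the index is created and never changed afterwards: if the number changed, every previously computed routing value would become invalid and the documents could no longer be found.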
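The formula can be tried out in a few lines. This is only a sketch of the modulo step: it uses CRC32 as a stand-in hash, which is an assumption for illustration and not the hash function Elasticsearch uses internally, so the shard numbers it prints will not match a real cluster. The function name pick_shard is also hypothetical.

```python
import zlib


def pick_shard(routing, number_of_primary_shards):
    """Map a routing string to a shard number with hash(routing) % n."""
    # Stand-in hash: CRC32 is stable across runs, unlike Python's built-in hash() for str.
    # It is NOT the hash Elasticsearch uses.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


doc_id = "YxzDN2kBVzdNcvl_u3th"
print(pick_shard(doc_id, 5))
# With a different shard count the same id can land on a different shard,
# which is why the primary shard count is fixed at index creation
print(pick_shard(doc_id, 6))
```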

So you can use

GET test/_search_shards?routing=<document id>

to see which shard the id you are looking for is routed to.

The result:

{
  "nodes" : {
    "yl-qYmh1SXqzJsfI4d1ddw" : {
      "name" : "node-3",
      "ephemeral_id" : "UsJ9rFELTiCW07oHE9YMdg",
      "transport_address" : "10.18.6.26:9300",
      "attributes" : {
        "ml.machine_memory" : "6088101888",
        "rack" : "r1",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true",
        "ml.enabled" : "true"
      }
    },
    "wT7wUd3iTkujYUsbVNVv-w" : {
      "name" : "node-1",
      "ephemeral_id" : "fP-vgSb0SdemnHDyaJUsWw",
      "transport_address" : "10.18.6.102:9300",
      "attributes" : {
        "ml.machine_memory" : "8256720896",
        "rack" : "r1",
        "xpack.installed" : "true",
        "ml.max_open_jobs" : "20",
        "ml.enabled" : "true"
      }
    }
  },
  "indices" : {
    "test" : { }
  },
  "shards" : [
    [
      {
        "state" : "STARTED",
        "primary" : true,
        "node" : "wT7wUd3iTkujYUsbVNVv-w",
        "relocating_node" : null,
        "shard" : 1,
        "index" : "test",
        "allocation_id" : {
          "id" : "k-8E4dL7QmGgwcsNsUCP6Q"
        }
      },
      {
        "state" : "STARTED",
        "primary" : false,
        "node" : "yl-qYmh1SXqzJsfI4d1ddw",
        "relocating_node" : null,
        "shard" : 1,
        "index" : "test",
        "allocation_id" : {
          "id" : "lvOpQIKgRUibkulr3nRfEw"
        }
      }
    ]
  ]
}

Only the shards field matters here. As shown above, the document lives on shard 1. Copies of that shard sit on node-1 and node-3: one is the primary and the other is a replica.
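Reading the response by eye gets tedious with more shards, so the lookup can be scripted. The sketch below works on a trimmed copy of the response above, already parsed into a dict; shard_locations is a hypothetical helper name.

```python
def shard_locations(response):
    """Return (shard, node name, role) tuples from a _search_shards response dict."""
    names = {node_id: info["name"] for node_id, info in response["nodes"].items()}
    rows = []
    for group in response["shards"]:          # one group per shard number
        for copy in group:                    # primary and replica copies
            role = "primary" if copy["primary"] else "replica"
            rows.append((copy["shard"], names[copy["node"]], role))
    return rows


# Trimmed copy of the response from the post
response = {
    "nodes": {
        "yl-qYmh1SXqzJsfI4d1ddw": {"name": "node-3"},
        "wT7wUd3iTkujYUsbVNVv-w": {"name": "node-1"},
    },
    "shards": [[
        {"primary": True, "node": "wT7wUd3iTkujYUsbVNVv-w", "shard": 1},
        {"primary": False, "node": "yl-qYmh1SXqzJsfI4d1ddw", "shard": 1},
    ]],
}
for row in shard_locations(response):
    print(row)
```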

Elasticsearch: error when using the type field inside a match query

Database • Posted by 李魔佛 • 0 comments • 12578 views • 2019-05-26 00:26 • from related topics

POST get-together/_search
{
  "query":
  {
    "match": {
      "name": {
     "type":"phrase",
      "query":"enterprise london",
      "slop":1
    }}
  },
  "_source": "name"
}

This returns an error:

{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "[match] query does not support [type]",
        "line": 6,
        "col": 13
      }
    ],
    "type": "parsing_exception",
    "reason": "[match] query does not support [type]",
    "line": 6,
    "col": 13
  },
  "status": 400
}

In 6.x, type is no longer supported inside match.
Rewrite the query with match_phrase instead:

POST get-together/_search
{
  "query":
  {
    "match_phrase": {
      "name": {
        "query":"enterprise london",
        "slop":1
      }
    }
  },
  "_source": "name"
}

This gives the same result:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.3243701,
    "hits" : [
      {
        "_index" : "get-together",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.3243701,
        "_source" : {
          "name" : "Enterprise search London get-together"
        }
      }
    ]
  }
}
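If old-style query bodies are generated in code, the same rewrite can be applied before sending them. This is a rough sketch that handles only the "type": "phrase" case from this post; upgrade_match_query is a hypothetical helper and it assumes the match clause names exactly one field.

```python
import copy


def upgrade_match_query(query):
    """Rewrite {"match": {field: {"type": "phrase", ...}}} as a match_phrase query."""
    query = copy.deepcopy(query)              # leave the caller's dict untouched
    match = query.get("match")
    if not match:
        return query
    (field, params), = match.items()          # a match clause names exactly one field
    if isinstance(params, dict) and params.get("type") == "phrase":
        params.pop("type")
        return {"match_phrase": {field: params}}
    return query


old = {"match": {"name": {"type": "phrase", "query": "enterprise london", "slop": 1}}}
print(upgrade_match_query(old))
```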

A pitfall when updating documents in Elasticsearch

Database • Posted by 李魔佛 • 2 comments • 9866 views • 2019-05-24 22:46 • from related topics

POST cnbeta/doc/cUxO42oB9O-zF2ru-rs-/_update
{
  "doc": {
    "title": "中国操作系统"
  }
}

The "doc" wrapper in the body is required.
Without it the request fails with:

{
  "error": {
    "root_cause": [
      {
        "type": "action_request_validation_exception",
        "reason": "Validation Failed: 1: script or doc is missing;"
      }
    ],
    "type": "action_request_validation_exception",
    "reason": "Validation Failed: 1: script or doc is missing;"
  },
  "status": 400
}
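When the update body is built in code, a tiny wrapper makes it hard to forget the "doc" key. A minimal sketch, with partial_update_body as a hypothetical helper name:

```python
import json


def partial_update_body(fields):
    """Wrap the changed fields in the "doc" key that the _update endpoint expects."""
    if not fields:
        raise ValueError("nothing to update")
    return {"doc": dict(fields)}


body = partial_update_body({"title": "中国操作系统"})
print(json.dumps(body, ensure_ascii=False))
```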

A question about a crawler for 懒人听书

python crawler • b842619045 replied to the question • 3 followers • 2 replies • 3917 views • 2019-05-22 23:04 • from related topics

Posting an image file directly with requests

python crawler • Posted by 李魔佛 • 0 comments • 3419 views • 2019-05-17 16:32 • from related topics

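The code is as follows:

    import requests

    file_path = r'9927_15562445086485238.png'
    # Read the raw image bytes; the with block closes the file afterwards
    with open(file_path, 'rb') as f:
        file = f.read()
    # code_url is the upload endpoint, defined elsewhere
    r = requests.post(url=code_url, data=file)
    print(r.text)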

Mixin classes in Python

python • Posted by 李魔佛 • 0 comments • 2759 views • 2019-05-16 16:30 • from related topics

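A mixin is a limited form of multiple inheritance.

About the Mixin pattern in Python

Languages such as C++ support multiple inheritance, where a subclass can have several parent classes. This design is often criticized, because inheritance is supposed to express an "is-a" relationship. A car class inherits from a vehicle class because a car is a ("is-a") vehicle. One thing cannot be several different kinds of thing at once, so the argument goes that multiple inheritance should not exist. But are there cases where a class really does need to inherit from more than one class?

There are. Staying with vehicles: an airliner is a vehicle, and for the very rich a helicopter is a vehicle too. Both can fly, but a car cannot, so flying cannot be put into the vehicle parent class. If the airliner and the helicopter each wrote their own fly method, that would go against code reuse (and the duplication grows as more flying machines are added). The only way out seems to be to have both inherit from two parents, the vehicle and the aircraft, which is multiple inheritance, and that in turn breaks the rule that inheritance must be an "is-a" relationship. How do we get out of this?

Different languages offer different answers. Let's look at Java first. Java provides interfaces to get the effect of multiple inheritance:

public abstract class Vehicle {
}

public interface Flyable {
    public void fly();
}

public class FlyableImpl implements Flyable {
    public void fly() {
        System.out.println("I am flying");
    }
}

public class Airplane extends Vehicle implements Flyable {
    // Delegate flying to a reusable implementation
    private Flyable flyable;

    public Airplane() {
        flyable = new FlyableImpl();
    }

    public void fly() {
        flyable.fly();
    }
}

Now our airplane has both the vehicle and the flyable aspects, we did not have to rewrite the fly method, and single inheritance is intact. The airplane is a vehicle; the ability to fly is a property of the airplane, obtained by implementing an interface.

Back to the topic. Python has no interface construct, but it does allow multiple inheritance. So should Python use multiple inheritance here? Yes and no. Yes, because syntactically it is indeed multiple inheritance. No, because the inheritance still respects the "is-a" relationship and remains single inheritance in meaning. An example makes this clearer.

class Vehicle(object):
    pass

class PlaneMixin(object):
    def fly(self):
        print('I am flying')

class Airplane(Vehicle, PlaneMixin):
    pass

As you can see, Airplane uses multiple inheritance, but the second class it inherits from is named PlaneMixin rather than Plane. The name does not affect behavior, but it tells later readers that this class is a mixin. In terms of meaning, Airplane is only a Vehicle, not a Plane. "Mixin" means mixed in: the class exists to add a capability to the subclass, not to act as its parent, and it plays the same role as an interface in Java.

Be very careful when using mixin classes for multiple inheritance:

First, a mixin must represent a capability rather than a thing, like Runnable or Callable in Java.
Second, it must have a single responsibility; if there are several capabilities, write several mixin classes.
Third, it must not depend on the subclass's implementation.
Finally, the subclass still works without the mixin; it just lacks that one capability. (The airplane can still carry passengers, it just can't fly ^_^)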
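The four rules can be checked against a small runnable sketch. The classes below are illustrative only: FlyMixin, Car and the name attribute are not in the original example, and the mixin is listed first in the base-class list, a common convention that differs from the ordering used in the post.

```python
class Vehicle:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return "{} is a vehicle".format(self.name)


class FlyMixin:
    """One capability only; holds no state and does not rely on the host class."""

    def fly(self):
        return "I am flying"


class Airplane(FlyMixin, Vehicle):
    pass


class Car(Vehicle):
    pass


plane = Airplane("airliner")
print(plane.describe())                       # behaves as a Vehicle
print(plane.fly())                            # capability added by the mixin
print(hasattr(Car("sedan"), "fly"))           # a Vehicle without the mixin still works
print([cls.__name__ for cls in Airplane.__mro__])
```

The last line prints the method resolution order, which shows where the mixin sits between Airplane and Vehicle.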

Original article; please credit the source when reposting:
http://30daydo.com/article/480

Year-to-date gains of bank stocks, ranked, as of today (2019-05-14)

Stocks • Posted by 李魔佛 • 0 comments • 4995 views • 2019-05-14 23:59 • from related topics

今年涨幅最少的是农业银行，最多的是新股西安银行。

index	ticker	secShortName	secFullName	y_chgPct
31	601288	农业银行	中国农业银行股份有限公司	-0.341178
11	600015	华夏银行	华夏银行股份有限公司	1.856174
45	601988	中国银行	中国银行股份有限公司	2.248533
32	601328	交通银行	交通银行股份有限公司	3.532657
30	601229	上海银行	上海银行股份有限公司	3.725781
35	601398	工商银行	中国工商银行股份有限公司	4.771403
40	601818	光大银行	中国光大银行股份有限公司	5.643119
27	601169	北京银行	北京银行股份有限公司	6.205580
14	600016	民生银行	中国民生银行股份有限公司	7.092815
5	002936	郑州银行	郑州银行股份有限公司	7.551112
49	601998	中信银行	中信银行股份有限公司	8.181956
43	601939	建设银行	中国建设银行股份有限公司	9.402651
41	601838	成都银行	成都银行股份有限公司	9.424554
52	603323	苏农银行	江苏苏州农村商业银行股份有限公司	12.375732
20	600926	杭州银行	杭州银行股份有限公司	12.933645
8	600000	浦发银行	上海浦东发展银行股份有限公司	14.752244
39	601577	长沙银行	长沙银行股份有限公司	14.792683
18	600908	无锡银行	无锡农村商业银行股份有限公司	16.181704
3	002807	江阴银行	江苏江阴农村商业银行股份有限公司	19.274586
48	601997	贵阳银行	贵阳银行股份有限公司	20.489563
4	002839	张家港行	江苏张家港农村商业银行股份有限公司	20.599511
25	601166	兴业银行	兴业银行股份有限公司	21.206503
24	601128	常熟银行	江苏常熟农村商业银行股份有限公司	21.571187
19	600919	江苏银行	江苏银行股份有限公司	23.218299
22	601009	南京银行	南京银行股份有限公司	26.297500
16	600036	招商银行	招商银行股份有限公司	27.518708
0	000001	平安银行	平安银行股份有限公司	31.624747
2	002142	宁波银行	宁波银行股份有限公司	31.729062
6	002948	青岛银行	青岛银行股份有限公司	48.602573
7	002958	青农商行	青岛农村商业银行股份有限公司	108.983776
42	601860	紫金银行	江苏紫金农村商业银行股份有限公司	115.147347
21	600928	西安银行	西安银行股份有限公司	128.496683

Stripping line breaks from Chinese text with a regular expression [python]

Python crawlers • Posted by 李魔佛 • 0 comments • 2861 views • 2019-05-13 11:02

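The content of js (Chinese text) contains \r\n line breaks.
Use a regular expression to strip them. (The replacement can be any string, not just the empty one.)

import re
js = re.sub(r'\r\n', '', js)

Done.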
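A small sketch of the substitution above. The sample string is invented for illustration; the second pattern is an assumption about what is usually wanted when the text mixes `\r\n`, bare `\n` and bare `\r`.

```python
import re

# Hypothetical scraped text that mixes three kinds of line break.
js = "var title = '第一行\r\n第二行\n第三行\r';"

# Pattern from the post: removes only the two-character sequence \r\n.
only_crlf = re.sub(r'\r\n', '', js)

# Broader pattern: removes any run of \r and \n characters.
all_breaks = re.sub(r'[\r\n]+', '', js)

print(repr(only_crlf))
print(repr(all_breaks))
```

For a fixed string like `'\r\n'`, `js.replace('\r\n', '')` does the same job without a regular expression; `re.sub` pays off once the pattern has alternatives.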

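Request headers show "Provisional headers are shown"

Python crawlers • Posted by 李魔佛 • 0 comments • 4852 views • 2019-05-13 10:07

This usually happens because of a browser extension, for example an ad blocker such as AdBlock.
Uninstalling the extension solved the problem.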

Async crawler: submitting POST data with aiohttp

Python crawlers • Posted by 李魔佛 • 0 comments • 7728 views • 2019-05-08 16:40

Basic usage:

async def fetch(session, url, data):
    # headers is the module-level dict defined in the complete example
    async with session.post(url=url, data=data, headers=headers) as response:
        return await response.json()

A complete example:

import aiohttp
import asyncio

page = 30

post_data = {
    'page': 1,
    'pageSize': 10,
    'keyWord': '',
    'dpIds': '',
}

headers = {
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "en-US,en;q=0.9",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.108 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest",
}

result = []  # records collected from every page


async def fetch(session, url, data):
    async with session.post(url=url, data=data, headers=headers) as response:
        return await response.json()


async def parse(html):
    # html is the decoded JSON response (a dict)
    xzcf_list = html.get('newtxzcfList')
    if xzcf_list is None:
        return
    for i in xzcf_list:
        result.append(i)


async def downlod(page):
    data = post_data.copy()
    data['page'] = page
    url = 'http://credit.chaozhou.gov.cn/tfieldTypeActionJson!initXzcfListnew.do'
    async with aiohttp.ClientSession() as session:
        html = await fetch(session, url, data)
        await parse(html)


async def main():
    # fetch pages 1 .. page-1 concurrently
    await asyncio.gather(*(downlod(i) for i in range(1, page)))


asyncio.run(main())

count = 0
for i in result:
    print(i.get('cfXdrMc'))
    count += 1
print(f'total {count}')
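The example above fires all page requests at once. The sketch below shows one way to cap the number of requests in flight with `asyncio.Semaphore`. To stay runnable without network access, the real `session.post(...)` call is replaced by a stand-in coroutine; `fake_post`, `MAX_CONCURRENCY` and the record contents are assumptions made for the illustration, not part of the original crawler.

```python
import asyncio

MAX_CONCURRENCY = 5  # assumed limit; tune to what the target site tolerates


async def fake_post(page):
    """Stand-in for session.post(...); returns a dict shaped like one JSON page."""
    await asyncio.sleep(0.01)
    return {'newtxzcfList': [{'cfXdrMc': f'record-{page}'}]}


async def download(page, semaphore, result):
    # At most MAX_CONCURRENCY coroutines are inside this block at any time.
    async with semaphore:
        payload = await fake_post(page)
    result.extend(payload.get('newtxzcfList') or [])


async def crawl(pages):
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)
    result = []
    await asyncio.gather(*(download(p, semaphore, result) for p in pages))
    return result


records = asyncio.run(crawl(range(1, 30)))
print(f'total {len(records)}')
```

In a real crawler the same structure would normally also share a single `aiohttp.ClientSession` across all pages instead of opening one per request.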

Python async crawler with aiohttp: scraping Lianjia listings asynchronously

Python crawlers • Posted by 李魔佛 • 0 comments • 2727 views • 2019-05-08 15:52

import requests
from lxml import etree
import asyncio
import aiohttp
import pandas
import re
import math
import time

loction_info = '''    1→Hangzhou
    2→Wuhan
    3→Beijing
    Press ENTER to confirm: '''
loction_select = input(loction_info)
loction_dic = {'1': 'hz',
               '2': 'wh',
               '3': 'bj'}
city_url = 'https://{}.lianjia.com/ershoufang/'.format(loction_dic[loction_select])
down = input('Lower price bound (10k CNY): ')
up = input('Upper price bound (10k CNY): ')

# price intervals still to be crawled; starts with the full range
inter_list = [(int(down), int(up))]


def half_inter(inter):
    # replace one interval with its two halves
    lower = inter[0]
    upper = inter[1]
    delta = int((upper - lower) / 2)
    inter_list.remove(inter)
    print('Split price interval', inter)
    inter_list.append((lower, lower + delta))
    inter_list.append((lower + delta, upper))


pagenum = {}  # interval -> number of listings reported by the site


def get_num(inter):
    url = city_url + 'bp{}ep{}/'.format(inter[0], inter[1])
    r = requests.get(url).text
    num = int(etree.HTML(r).xpath("//h2[@class='total fl']/span/text()")[0].strip())
    pagenum[(inter[0], inter[1])] = num
    return num


totalnum = get_num(inter_list[0])

# keep halving until no interval reports more than 3000 listings
judge = True
while judge:
    a = [get_num(x) > 3000 for x in inter_list]
    judge = True in a
    # iterate over a copy, because half_inter modifies inter_list
    for i in list(inter_list):
        if pagenum[i] > 3000:
            half_inter(i)
print('Price intervals are small enough!')

url_lst = []
url_lst_failed = []
url_lst_successed = []
url_lst_duplicated = []

for i in inter_list:
    totalpage = math.ceil(pagenum[i] / 30)  # 30 listings per page
    for j in range(1, totalpage + 1):
        url = city_url + 'pg{}bp{}ep{}/'.format(j, i[0], i[1])
        url_lst.append(url)
print('URL list is ready!')

info_lst = []



def safe(getter):
    # run getter(); fall back to '/' when the node or field is missing
    try:
        return getter()
    except Exception:
        return '/'


async def get_info(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, timeout=5) as resp:
            if resp.status != 200:
                url_lst_failed.append(url)
            else:
                url_lst_successed.append(url)
            r = await resp.text()
            nodelist = etree.HTML(r).xpath("//ul[@class='sellListContent']/li")
            print('Start scraping {}'.format(resp.url))
            index = 1
            for node in nodelist:
                # the three text blobs that most fields below are cut from
                def house(n):
                    text = node.xpath(".//div[@class='houseInfo']")[0].xpath('string(.)')
                    return text.replace(' ', '').split('|')[n]

                def follow(n):
                    text = node.xpath(".//div[@class='followInfo']/text()")[0]
                    return text.replace(' ', '').split('/')[n]

                def position():
                    return node.xpath(".//div[@class='positionInfo']/text()")[0]

                info_dic = {
                    'title': safe(lambda: node.xpath(".//div[@class='title']/a/text()")[0]),
                    'href': safe(lambda: node.xpath(".//div[@class='title']/a/@href")[0]),
                    'xiaoqu': safe(lambda: house(0)),
                    'huxing': safe(lambda: house(1)),
                    'area': safe(lambda: house(2)),
                    'chaoxiang': safe(lambda: house(3)),
                    'zhuangxiu': safe(lambda: house(4)),
                    'dianti': safe(lambda: house(5)),
                    'louceng': safe(lambda: re.findall(r'\((.*)\)', position())),
                    'nianxian': safe(lambda: re.findall(r'\)(.*?)年', position())),
                    'guanzhu': safe(lambda: ''.join(re.findall('[0-9]', follow(0)))),
                    'daikan': safe(lambda: ''.join(re.findall('[0-9]', follow(1)))),
                    'fabu': safe(lambda: follow(2)),
                    'totalprice': safe(lambda: node.xpath(".//div[@class='totalPrice']/span/text()")[0]),
                    'unitprice': safe(lambda: node.xpath(".//div[@class='unitPrice']/span/text()")[0].replace('单价', '')),
                }
                # a listing whose href was already collected counts as a duplicate
                if any(info_dic['href'] in dic.values() for dic in info_lst):
                    url_lst_duplicated.append(info_dic)
                else:
                    info_lst.append(info_dic)
                print('Item {}:    {}→listing scraped!'.format(index, info_dic['title']))
                index += 1



start = time.time()


async def crawl(urls):
    # return_exceptions=True keeps one failed request from aborting the rest
    await asyncio.gather(*(get_info(url) for url in urls), return_exceptions=True)


# First pass over url_lst. Some URLs end up in neither list, presumably because
# a request that times out or raises never reaches the append calls in get_info.
asyncio.run(crawl(url_lst))

# Collect the URLs that have not succeeded yet and retry them until all have
url_lst_unrequested = [url for url in url_lst if url not in url_lst_successed]
while len(url_lst_unrequested) > 0:
    asyncio.run(crawl(url_lst_unrequested))
    url_lst_unrequested = [url for url in url_lst if url not in url_lst_successed]
end = time.time()
print('This price range holds {} second-hand listings ({} duplicates); {} records were collected.'.format(totalnum, len(url_lst_duplicated), len(info_lst)))
print('Total time: {} seconds'.format(end - start))

df = pandas.DataFrame(info_lst)
df.to_csv("ljwh.csv", encoding='gbk')
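The script keeps halving the price range until every interval reports no more than 3000 listings. That step can be looked at in isolation. The sketch below is not the author's code: it uses a work list instead of modifying a list while looping over it, and `fake_count` is a made-up stand-in for the HTTP request that asks the site how many listings fall in a range.

```python
def split_intervals(count, lower, upper, limit=3000):
    """Split [lower, upper] until every piece holds at most `limit` items."""
    pending = [(lower, upper)]
    done = []
    while pending:
        lo, hi = pending.pop()
        # Accept the piece if it is small enough, or too narrow to halve again.
        if count(lo, hi) <= limit or hi - lo <= 1:
            done.append((lo, hi))
            continue
        mid = lo + (hi - lo) // 2
        pending.append((lo, mid))
        pending.append((mid, hi))
    return sorted(done)


def fake_count(lo, hi):
    # Hypothetical market: 50 listings per price unit, spread evenly.
    return (hi - lo) * 50


intervals = split_intervals(fake_count, 100, 500)
print(intervals)
```

Neighbouring intervals share an endpoint, as they do in the original script, so a listing priced exactly on a boundary can show up twice; that is one reason the crawler needs its duplicate check.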

Computing numerical gradients in a neural network: Python code

Deep learning • Posted by 李魔佛 • 0 comments • 4233 views • 2019-05-07 19:12

Getting started with deep learning in Python

import numpy as np


def softmax(a):
    a = a - np.max(a)  # subtract the max so exp() cannot overflow
    exp_a = np.exp(a)
    exp_a_sum = np.sum(exp_a)
    return exp_a / exp_a_sum


def cross_entropy_error(t, y):
    # t: one-hot target, y: predicted probabilities
    delta = 1e-7  # avoids log(0)
    s = -1 * np.sum(t * np.log(y + delta))
    return s


class simpleNet:
    def __init__(self):
        self.W = np.random.randn(2, 3)

    def predict(self, x):
        print('current w', self.W)
        return np.dot(x, self.W)

    def loss(self, x, t):
        z = self.predict(x)
        y = softmax(z)  # y is the predicted value
        loss = cross_entropy_error(t, y)  # target first, prediction second
        return loss


def numerical_gradient_(f, x):  # works for 2-D and higher-dimensional arrays
    h = 1e-4  # 0.0001
    grad = np.zeros_like(x)

    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        idx = it.multi_index
        print('idx', idx)
        tmp_val = x[idx]
        x[idx] = float(tmp_val) + h
        fxh1 = f(x)  # f(x+h)
        print('fxh1 ', fxh1)
        x[idx] = tmp_val - h
        fxh2 = f(x)  # f(x-h)
        print('fxh2 ', fxh2)
        grad[idx] = (fxh1 - fxh2) / (2 * h)  # central difference

        x[idx] = tmp_val  # restore the original value
        it.iternext()

    return grad


net = simpleNet()
x = np.array([0.6, 0.9])
t = np.array([0.0, 0.0, 1.0])


def f(W):
    # W is net.W itself, changed in place by numerical_gradient_
    return net.loss(x, t)


grads = numerical_gradient_(f, net.W)
print(grads)
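A numerical gradient is easy to get subtly wrong, so it helps to compare it with a closed form. For softmax followed by cross-entropy with a one-hot target, the gradient with respect to W is the outer product of x and (y - t). The sketch below is a self-contained check under that assumption; the function names and the fixed random seed are choices made for this example, and the small `delta` term is left out so that the closed form is exact.

```python
import numpy as np


def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()


def loss(W, x, t):
    # Cross-entropy of softmax(x @ W) against the one-hot target t.
    return -np.sum(t * np.log(softmax(x @ W)))


def numerical_gradient(f, W, h=1e-4):
    # Central difference, one element of W at a time.
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        old = W[idx]
        W[idx] = old + h
        f_plus = f(W)
        W[idx] = old - h
        f_minus = f(W)
        W[idx] = old  # restore
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad


rng = np.random.default_rng(0)
W = rng.standard_normal((2, 3))
x = np.array([0.6, 0.9])
t = np.array([0.0, 0.0, 1.0])

numeric = numerical_gradient(lambda w: loss(w, x, t), W)
analytic = np.outer(x, softmax(x @ W) - t)
print(np.max(np.abs(numeric - analytic)))
```

If the two arrays disagree by much more than rounding error, the usual suspects are swapped arguments in the loss or an integer array being perturbed by h.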

[Convertible bonds ranked by remaining unconverted ratio] [2019-05-06]

Stocks • Posted by 李魔佛 • 0 comments • 5470 views • 2019-05-06 15:28

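The data is as follows:

The smaller the remaining unconverted ratio, the less incentive the listed company has to lower the conversion price. In other words, it will simply leave the convertible bond hanging and will not actively push up the underlying stock.

The data is updated periodically.

Original article;
please credit the source when reposting:
http://30daydo.com/article/472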

ElasticSearch cluster cannot discover its nodes [solved]

Databases • Posted by 李魔佛 • 0 comments • 3928 views • 2019-05-05 10:00

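A single node ran fine, but once several servers were configured as a cluster, Elasticsearch kept reporting that it could not discover the other servers. After some digging, the cause turned out to be the timeout setting in the configuration file: the timeout value needs to be larger, and then the cluster can discover the other nodes on the LAN.

Edit the timeout parameter in elasticsearch.yml; raising it to 10 times its original value was enough.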

PyCharm activation code valid until November 2019

Chat • Posted by 李魔佛 • 0 comments • 2973 views • 2019-05-05 09:47

The PyCharm Professional activation code is below; tested and working, valid until 2019-11-07

MTW881U3Z5-eyJsaWNlbnNlSWQiOiJNVFc4ODFVM1o1IiwibGljZW5zZWVOYW1lIjoiTnNzIEltIiwiYXNzaWduZWVOYW1lIjoiIiwiYXNzaWduZWVFbWFpbCI6IiIsImxpY2Vuc2VSZXN0cmljdGlvbiI6IkZvciBlZHVjYXRpb25hbCB1c2Ugb25seSIsImNoZWNrQ29uY3VycmVudFVzZSI6ZmFsc2UsInByb2R1Y3RzIjpbeyJjb2RlIjoiSUkiLCJwYWlkVXBUbyI6IjIwMTktMTEtMDYifSx7ImNvZGUiOiJBQyIsInBhaWRVcFRvIjoiMjAxOS0xMS0wNiJ9LHsiY29kZSI6IkRQTiIsInBhaWRVcFRvIjoiMjAxOS0xMS0wNiJ9LHsiY29kZSI6IlBTIiwicGFpZFVwVG8iOiIyMDE5LTExLTA2In0seyJjb2RlIjoiR08iLCJwYWlkVXBUbyI6IjIwMTktMTEtMDYifSx7ImNvZGUiOiJETSIsInBhaWRVcFRvIjoiMjAxOS0xMS0wNiJ9LHsiY29kZSI6IkNMIiwicGFpZFVwVG8iOiIyMDE5LTExLTA2In0seyJjb2RlIjoiUlMwIiwicGFpZFVwVG8iOiIyMDE5LTExLTA2In0seyJjb2RlIjoiUkMiLCJwYWlkVXBUbyI6IjIwMTktMTEtMDYifSx7ImNvZGUiOiJSRCIsInBhaWRVcFRvIjoiMjAxOS0xMS0wNiJ9LHsiY29kZSI6IlBDIiwicGFpZFVwVG8iOiIyMDE5LTExLTA2In0seyJjb2RlIjoiUk0iLCJwYWlkVXBUbyI6IjIwMTktMTEtMDYifSx7ImNvZGUiOiJXUyIsInBhaWRVcFRvIjoiMjAxOS0xMS0wNiJ9LHsiY29kZSI6IkRCIiwicGFpZFVwVG8iOiIyMDE5LTExLTA2In0seyJjb2RlIjoiREMiLCJwYWlkVXBUbyI6IjIwMTktMTEtMDYifSx7ImNvZGUiOiJSU1UiLCJwYWlkVXBUbyI6IjIwMTktMTEtMDYifV0sImhhc2giOiIxMDgyODE0Ni8wIiwiZ3JhY2VQZXJpb2REYXlzIjowLCJhdXRvUHJvbG9uZ2F0ZWQiOmZhbHNlLCJpc0F1dG9Qcm9sb25nYXRlZCI6ZmFsc2V9-aKyalfjUfiV5UXfhaMGgOqrMzTYy2rnsmobL47k8tTpR/jvG6HeL3FxxleetI+W+Anw3ZSe8QAMsSxqVS4podwlQgIe7f+3w7zyAT1j8HMVlfl2h96KzygdGpDSbwTbwOkJ6/5TQOPgAP86mkaSiM97KgvkZV/2nXQHRz1yhm+MT+OsioTwxDhd/22sSGq6KuIztZ03UvSciEmyrPdl2ueJw1WuT9YmFjdtTm9G7LuXvCM6eav+BgCRm+wwtUeDfoQqigbp0t6FQgkdQrcjoWvLSB0IUgp/f4qGf254fA7lXskT2VCFdDvi0jgxLyMVct1cKnPdM6fkHnbdSXKYDWw==-MIIElTCCAn2gAwIBAgIBCTANBgkqhkiG9w0BAQsFADAYMRYwFAYDVQQDDA1KZXRQcm9maWxlIENBMB4XDTE4MTEwMTEyMjk0NloXDTIwMTEwMjEyMjk0NlowaDELMAkGA1UEBhMCQ1oxDjAMBgNVBAgMBU51c2xlMQ8wDQYDVQQHDAZQcmFndWUxGTAXBgNVBAoMEEpldEJyYWlucyBzLnIuby4xHTAbBgNVBAMMFHByb2QzeS1mcm9tLTIwMTgxMTAxMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAxcQkq+zdxlR2mmRYBPzGbUNdMN6OaXiXzxIWtMEkrJMO/5oUfQJbLLuMSMK0QHFmaI37WShyxZcfRCidwXjot4zmNBKnlyHodDij/78TmVqFl8nOeD5+07B8VEaIu7c3E1N+e1doC6wht4I4+IEmtsPAdoaj5WCQVQbrI8KeT8M
9VcBIWX7fD0fhexfg3ZRt0xqwMcXGNp3DdJHiO0rCdU+Itv7EmtnSVq9jBG1usMSFvMowR25mju2JcPFp1+I4ZI+FqgR8gyG8oiNDyNEoAbsR3lOpI7grUYSvkB/xVy/VoklPCK2h0f0GJxFjnye8NT1PAywoyl7RmiAVRE/EKwIDAQABo4GZMIGWMAkGA1UdEwQCMAAwHQYDVR0OBBYEFGEpG9oZGcfLMGNBkY7SgHiMGgTcMEgGA1UdIwRBMD+AFKOetkhnQhI2Qb1t4Lm0oFKLl/GzoRykGjAYMRYwFAYDVQQDDA1KZXRQcm9maWxlIENBggkA0myxg7KDeeEwEwYDVR0lBAwwCgYIKwYBBQUHAwEwCwYDVR0PBAQDAgWgMA0GCSqGSIb3DQEBCwUAA4ICAQAF8uc+YJOHHwOFcPzmbjcxNDuGoOUIP+2h1R75Lecswb7ru2LWWSUMtXVKQzChLNPn/72W0k+oI056tgiwuG7M49LXp4zQVlQnFmWU1wwGvVhq5R63Rpjx1zjGUhcXgayu7+9zMUW596Lbomsg8qVve6euqsrFicYkIIuUu4zYPndJwfe0YkS5nY72SHnNdbPhEnN8wcB2Kz+OIG0lih3yz5EqFhld03bGp222ZQCIghCTVL6QBNadGsiN/lWLl4JdR3lJkZzlpFdiHijoVRdWeSWqM4y0t23c92HXKrgppoSV18XMxrWVdoSM3nuMHwxGhFyde05OdDtLpCv+jlWf5REAHHA201pAU6bJSZINyHDUTB+Beo28rRXSwSh3OUIvYwKNVeoBY+KwOJ7WnuTCUq1meE6GkKc4D/cXmgpOyW/1SmBz3XjVIi/zprZ0zf3qH5mkphtg6ksjKgKjmx1cXfZAAX6wcDBNaCL+Ortep1Dh8xDUbqbBVNBL4jbiL3i3xsfNiyJgaZ5sX7i8tmStEpLbPwvHcByuf59qJhV/bZOl8KqJBETCDJcY6O2aqhTUy+9x93ThKs1GKrRPePrWPluud7ttlgtRveit/pcBrnQcXOl1rHq7ByB8CFAxNotRUYL9IF5n3wJOgkPojMy6jetQA5Ogc8Sm7RG6vg1yow==

Original article;
please credit the source when reposting:
http://30daydo.com/article/470

Usage of numpy's flatten function

Quant trading-Ptrade-QMT • Posted by 李魔佛 • 0 comments • 4891 views • 2019-04-30 10:01

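flatten collapses the data: whatever the number of dimensions, the result is 1-D.

Example:

x=np.array([[1,2,3,4],[5,6,7,8]])
x
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

Then apply flatten to x:

x.flatten()

The result:

array([1, 2, 3, 4, 5, 6, 7, 8])

flatten() does not take an axis argument. What you can choose is the traversal order, through the order parameter (for example order='F' walks the array column by column).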
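A short sketch of the two most common `order` values, plus the fact that `flatten` returns a copy rather than a view. The variable names are chosen for this example only.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

row_major = x.flatten()           # default order='C': row by row
col_major = x.flatten(order='F')  # column by column

# flatten() returns a copy, so editing the result leaves x untouched.
row_major[0] = 100

print(row_major.tolist())
print(col_major.tolist())
print(x[0, 0])
```

When a copy is not needed, `x.ravel()` or `x.reshape(-1)` give the same 1-D layout and return a view where possible.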

A nasty numpy pitfall that takes some experience to spot

Quant trading-Ptrade-QMT • Posted by 李魔佛 • 0 comments • 4299 views • 2019-04-30 00:04

A quadratic function of two variables:
y=X0**2+X1**2 # **2 means squared

import numpy as np

def function_2(x):
    return x[0]**2+x[1]**2

The following computes the partial derivatives of y, with respect to X0 and X1 in turn:

def numerical_gradient(f,x):
    grad = np.zeros_like(x)
    h=1e-4
    for idx in range(x.size):
        temp_v = x[idx]
        x[idx]=temp_v+h
        f1=f(x)
        print(x,f1)
        x[idx]=temp_v-h
        f2=f(x)
        print(x,f2)
        ret = (f1-f2)/(2*h)
        print(ret)
        x[idx]=temp_v
        grad[idx]=ret

    return grad

Then call

numerical_gradient(function_2,np.array([3,4]))

This computes the partial derivatives of y=X0**2+X1**2 at the point (3,4).
What result do you get?
Why do you get that result?
A beginner usually needs some time to find the cause.
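One way to see what is going on is to run the same function on an integer array and on a float array and compare. The sketch below repeats the function without the prints; the explanation in the comments (integer dtype silently truncating the perturbed value) is this editor's reading of the puzzle, not something the original post spells out.

```python
import numpy as np


def function_2(x):
    return x[0] ** 2 + x[1] ** 2


def numerical_gradient(f, x, h=1e-4):
    grad = np.zeros_like(x)  # inherits the dtype of x
    for idx in range(x.size):
        temp_v = x[idx]
        x[idx] = temp_v + h  # in an int array, 3.0001 is stored as 3
        f1 = f(x)
        x[idx] = temp_v - h  # and 2.9999 is stored as 2
        f2 = f(x)
        x[idx] = temp_v
        grad[idx] = (f1 - f2) / (2 * h)
    return grad


int_point = np.array([3, 4])        # integer dtype
float_point = np.array([3.0, 4.0])  # float dtype

bad = numerical_gradient(function_2, int_point)
good = numerical_gradient(function_2, float_point)
print(int_point.dtype, bad)
print(float_point.dtype, good)
```

With a float array the result is close to the exact gradient (6, 8). Passing `np.array([3, 4], dtype=float)` or writing the literals as `3.0, 4.0` avoids the problem.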

What axes mean in numpy and DataFrame, and what a negative axis means

Quant trading-Ptrade-QMT • Posted by 李魔佛 • 0 comments • 5123 views • 2019-04-28 14:22

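Take this array:

a=np.array([[[1,2],[3,4]],[[11,12],[13,14]]])
a
array([[[ 1,  2],
        [ 3,  4]],

       [[11, 12],
        [13, 14]]])

a has 3 levels of nested brackets, so it has 3 axes, numbered from 0 to 2: axis=0, 1, 2.
Now sum a along axis=0, 1 and 2 in turn.

a.sum(axis=0)

gives:

array([[12, 14],
       [16, 18]])

This means: strip the outermost bracket and add the two 2x2 blocks it contained, element by element.

Likewise:

a.sum(axis=1)

works one bracket level deeper: inside each 2x2 block, the rows are added together.
The result:

array([[ 4,  6],
       [24, 26]])

What about a.sum(axis=2)? Readers can try that for themselves.

Negative axes count from the end: for an array with 3 axes, axis=-3 means the same as axis=0.

a.sum(axis=-3)
array([[12, 14],
       [16, 18]])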
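The axis rules above can be checked mechanically. This sketch sums the same array along every axis and confirms that each negative axis matches its positive counterpart; the dictionary name `sums` is just a convenience for the example.

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]], [[11, 12], [13, 14]]])

# Sum along each axis; the summed axis disappears from the shape.
sums = {axis: a.sum(axis=axis) for axis in range(a.ndim)}
for axis, total in sums.items():
    print(axis, total.tolist())

# For a 3-axis array, axis -3, -2, -1 correspond to 0, 1, 2.
for axis in range(a.ndim):
    negative = axis - a.ndim
    assert np.array_equal(a.sum(axis=negative), sums[axis])
```

A quick way to remember it: the axis you name is the one that is removed from `a.shape`, here (2, 2, 2) becomes (2, 2) in every case.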

Usage of np.nonzero() [numpy beginner]

Quant trading-Ptrade-QMT • Posted by 李魔佛 • 0 comments • 5615 views • 2019-04-28 10:16

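np.nonzero() returns the positions of the non-zero elements.

The return value is a tuple with one array per dimension; each array holds the indices of the non-zero elements along that dimension.

For example:

n1=np.array([0,1,0,0,0,0,1,0,0,0,0,0,0,1])
n1.nonzero()

returns:

(array([ 1,  6, 13], dtype=int64),)

Note that the result is a tuple.
To get the indices themselves, use n1.nonzero()[0].

Original article;
please credit the source when reposting:
http://30daydo.com/article/466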
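The "one array per dimension" rule is easier to see on a 2-D input. In this sketch the matrix `m` is invented for the example; the row and column index arrays are paired up to give coordinates and then reused to pull out the values.

```python
import numpy as np

m = np.array([[0, 1, 0],
              [2, 0, 3]])

rows, cols = np.nonzero(m)  # one index array per dimension

# Pair the two arrays to get (row, column) coordinates.
pairs = list(zip(rows.tolist(), cols.tolist()))

# The same index arrays select the non-zero values themselves.
values = m[rows, cols]

print(pairs)
print(values.tolist())
```

If coordinates are all that is needed, `np.argwhere(m)` returns them directly as an array with one row per non-zero element.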
